Interaction trap systems for detecting protein interactions

ABSTRACT

Disclosed herein is a method of determining whether a first protein is capable of physically interacting with a second protein, involving: (a) providing a host cell which contains (i) a reporter gene operably linked to a protein binding site; (ii) a first fusion gene which expresses a first fusion protein, the first fusion protein including the first protein covalently bonded to a binding moiety which is capable of specifically binding to the protein binding site; and (iii) a second fusion gene which expresses a second fusion protein, the second fusion protein including the second protein covalently bonded to a gene activating moiety and being conformationally-constrained; and (b) measuring expression of the reporter gene as a measure of an interaction between the first and the second proteins. Also disclosed are methods for assaying protein interactions, and identifying antagonists and agonists of protein interactions. Proteins isolated by these methods are also discussed. Finally, populations of eukaryotic cells are disclosed, each cell having a recombinant DNA molecule encoding a conformationally-constrained intracellular peptide.

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation-in-part of U.S. Ser. No. 08/504,538, filed Jul. 20, 1995, which is a continuation-in-part of U.S. Ser. No. 08/278,082, filed Jul. 20, 1994.

BACKGROUND OF THE INVENTION

[0002] This invention relates to methods for detecting protein interactions and isolating novel proteins.

SUMMARY OF THE INVENTION

[0003] In general, the invention features methods for detecting interactions among proteins.

[0004] Accordingly, in one aspect, the invention features a method of determining whether a first protein is capable of physically interacting with a second protein. The method includes (a) providing a host cell which contains (i) a reporter gene operably linked to a DNA-binding-protein recognition site; (ii) a first fusion gene which expresses a first fusion protein, the first fusion protein comprising the first protein covalently bonded to a binding moiety which is capable of specifically binding to the DNA-binding-protein recognition site; and (iii) a second fusion gene which expresses a second fusion protein, the second fusion protein including the second protein covalently bonded to a gene activating moiety and being conformationally-constrained; and (b) measuring expression of the reporter gene as a measure of an interaction between the first and the second proteins.

[0005] Preferably, the second protein is a short peptide of at least 6 amino acids in length and is less than or equal to 60 amino acids in length; includes a randomly generated or intentionally designed peptide sequence; includes one or more loops; or is conformationally-constrained as a result of covalent bonding to a conformation-constraining protein, e.g., thioredoxin or a thioredoxin-like molecule. Where the second protein is covalently bonded to a conformation-constraining protein, the invention features a polypeptide wherein the second protein is embedded within the conformation-constraining protein to which it is covalently bonded. Where the conformation-constraining protein is thioredoxin, the invention also features an additional method which includes a second protein which is conformationally-constrained by disulfide bonds between cysteine residues in the amino-terminus and in the carboxy-terminus of the second protein.

[0006] In another aspect, the invention features a method of detecting an interacting protein in a population of proteins, comprising: (a) providing a host cell which contains (i) a reporter gene operably linked to a DNA-binding-protein recognition site; and (ii) a fusion gene which expresses a fusion protein, the fusion protein including a test protein covalently bonded to a binding moiety which is capable of specifically binding to the DNA-binding-protein recognition site; (b) introducing into the host cell a second fusion gene which expresses a second fusion protein, the second fusion protein including one of said population of proteins covalently bonded to a gene activating moiety and being conformationally-constrained; and (c) measuring expression of the reporter gene. Preferably, the population of proteins includes short peptides of between 1 and 60 amino acids in length.

[0007] The invention also features a method of detecting an interacting protein within a population wherein the population of proteins is a set of randomly generated or intentionally designed peptide sequences, or where the population of proteins is conformationally-constrained by covalent bonding to a conformation-constraining protein. Preferably, where the population of proteins is conformationally-constrained by covalent bonding to a conformation-constraining protein, the population of proteins is embedded within the conformation-constraining protein. The invention further features a method of detecting an interacting protein within a population wherein the conformation-constraining protein is thioredoxin. Preferably, the population of proteins is inserted into the active site loop of the thioredoxin.

[0008] The invention further features a method wherein each of the population of proteins is conformationally-constrained by disulfide bonds between cysteine residues in the amino-terminus and in the carboxy-terminus of said protein.

[0009] In preferred embodiments of various aspects, the host cell is yeast; the DNA binding domain is LexA; the interacting protein includes one or more loops; and/or the reporter gene is assayed by a color reaction or by cell viability.

[0010] In other embodiments the bait may be Cdk2 or a Ras protein sequence.

[0011] In another related aspect, the invention features a method of identifying a candidate interactor. The method includes (a) providing a reporter gene operably linked to a DNA-binding-protein recognition site; (b) providing a first fusion protein, which includes a first protein covalently bonded to a binding moiety which is capable of specifically binding to the DNA-binding-protein recognition site; (c) providing a second fusion protein, which includes a second protein covalently bonded to a gene activating moiety and being conformationally-constrained, the second protein being capable of interacting with said first protein; (d) contacting said candidate interactor with said first protein and/or said second protein; and (e) measuring expression of said reporter gene.

[0012] The invention features a method of identifying a candidate interactor wherein the first fusion protein is provided by providing a first fusion gene which expresses the first fusion protein and wherein the second fusion protein is provided by providing a second fusion gene which expresses said second fusion protein. Alternatively, the reporter gene, the first fusion gene, and the second fusion gene are included on a single piece of DNA.

[0013] The invention also features a method of identifying candidate interactors wherein the first fusion protein and the second fusion protein are permitted to interact prior to contact with said candidate interactor, and a related method wherein the first fusion protein and the candidate interactor are permitted to interact prior to contact with said second fusion protein.

[0014] In a preferred embodiment, the candidate interactor is conformationally-constrained and may include one or more loops. Where the candidate interactor is an antagonist, reporter gene expression is reduced. Where the candidate interactor is an agonist, reporter gene expression is increased. The candidate interactor is a member selected from the group consisting of proteins, polynucleotides, and small molecules. In addition, a candidate interactor can be encoded by a member of a cDNA or synthetic DNA library. Moreover, the candidate interactor can be a mutated form of said first fusion protein or said second fusion protein.

[0015] In a preferred embodiment of any of the above aspects, the candidate interactor is isolated in vitro and shown to function in vivo, i.e., as a conformationally constrained intracellular peptide.

[0016] In a related aspect, the invention features a population of eukaryotic cells, each cell having a recombinant DNA molecule encoding a conformationally-constrained intracellular peptide, there being at least 100 different recombinant molecules in the population, each molecule being in at least one cell of said population.

[0017] Preferably, the intracellular peptides within the population of cells are conformationally-constrained because they are covalently bonded to a conformation-constraining protein.

[0018] In preferred embodiments the intracellular peptide is embedded within the conformation-constraining protein, preferably thioredoxin; the intracellular peptide is conformationally-constrained by disulfide bonds between cysteine residues in the amino-terminus and in the carboxy-terminus of said second protein; the intracellular peptide includes one or more loops; the population of eukaryotic cells are yeast cells; the recombinant DNA molecule further encodes a gene activating moiety covalently bonded to said intracellular peptide; and/or the intracellular peptide physically interacts with a second recombinant protein inside said eukaryotic cells.

[0019] In another aspect, the invention features a method of assaying an interaction between a first protein and a second protein. The method includes: (a) providing a reporter gene operably linked to a DNA-binding-protein recognition site; (b) providing a first fusion protein including a first protein covalently bonded to a binding moiety which is capable of specifically binding to the DNA-binding-protein recognition site; (c) providing a second fusion protein including a second protein which is conformationally constrained (and may include one or more loops) and is covalently bonded to a gene activating moiety; (d) combining the reporter gene, the first fusion protein, and the second fusion protein; and (e) measuring expression of the reporter gene.

[0020] In a preferred embodiment, the invention further features a method of assaying the interaction between two proteins wherein the first fusion protein is provided by providing a first fusion gene which expresses the first fusion protein and wherein the second fusion protein is provided by providing a second fusion gene which expresses the second fusion protein. In another preferred embodiment, the interaction is assayed in vitro and shown to function in vivo, i.e., as a conformationally constrained intracellular peptide.

[0021] In yet other aspects, the invention features a protein including the sequence Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-Leu-Phe-Arg-Ser-Leu-Phe (SEQ ID NO: 1), preferably conformationally-constrained; a protein including the sequence Met-Val-Val-Ala-Ala-Glu-Ala-Val-Arg-Thr-Val-Leu-Leu-Ala-Asp-Gly-Gly-Asp-Val-Thr (SEQ ID NO: 2), preferably conformationally-constrained; a protein including the sequence Pro-Asn-Trp-Pro-His-Gln-Leu-Arg-Val-Gly-Arg-Val-Leu-Trp-Glu-Arg-Leu-Ser-Phe-Glu (SEQ ID NO: 3), preferably conformationally-constrained; a protein including the sequence Ser-Val-Arg-Met-Arg-Tyr-Gly-Ile-Asp-Ala-Phe-Phe-Asp-Leu-Gly-Gly-Leu-Leu-His-Gly (SEQ ID NO: 9), preferably conformationally-constrained; a protein including the sequence Glu-Leu-Arg-His-Arg-Leu-Gly-Arg-Ala-Leu-Ser-Glu-Asp-Met-Val-Arg-Gly-Leu-Ala-Trp-Gly-Pro-Thr-Ser-His-Cys-Ala-Thr-Val-Pro-Gly-Thr-Ser-Asp-Leu-Trp-Arg-Val-Ile-Arg-Phe-Leu (SEQ ID NO: 10), preferably conformationally-constrained; a protein including the sequence Tyr-Ser-Phe-Val-His-His-Gly-Phe-Phe-Asn-Phe-Arg-Val-Ser-Trp-Arg-Glu-Met-Leu-Ala (SEQ ID NO: 11), preferably conformationally-constrained; a protein including the sequence Gln-Val-Trp-Ser-Leu-Trp-Ala-Leu-Gly-Trp-Arg-Trp-Leu-Arg-Arg-Tyr-Gly-Trp-Asn-Met (SEQ ID NO: 12), preferably conformationally-constrained; a protein including the sequence Trp-Arg-Arg-Met-Glu-Leu-Asp-Ala-Glu-Ile-Arg-Trp-Val-Lys-Pro-Ile-Ser-Pro-Leu-Glu (SEQ ID NO: 13), preferably conformationally-constrained; a protein including the sequence Trp-Ala-Glu-Trp-Cys-Gly-Pro-Val-Cys-Ala-His-Gly-Ser-Arg-Ser-Leu-Thr-Leu-Leu-Thr-Lys-Tyr-His-Val-Ser-Phe-Leu-Gly-Pro-Cys-Lys-Met-Ile-Ala-Pro-Ile-Leu-Asp (SEQ ID NO: 17), preferably conformationally-constrained; a protein including the sequence Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-Leu-Phe-Arg-Ser-Leu-Phe (SEQ ID NO: 18), preferably conformationally-constrained; a protein including the sequence Tyr-Arg-Trp-Gln-Gln-Gly-Val-Val-Pro-Ser-Asn-Trp-Ala-Ser-Cys-Ser-Phe-Arg-Cys-Gly (SEQ ID NO: 19), preferably conformationally-constrained; a protein including the sequence Ser-Ser-Phe-Ser-Leu-Trp-Leu-Leu-Met-Val-Lys-Ser-Ile-Lys-Arg-Ala-Ala-Trp-Glu-Leu-Gly-Pro-Ser-Ser-Ala-Trp-Asn-Thr-Ser-Gly-Trp-Ala-Ser-Leu-Ala-Asp-Phe-Tyr (SEQ ID NO: 20), preferably conformationally-constrained; a protein including the sequence Arg-Val-Lys-Leu-Gly-Tyr-Ser-Phe-Trp-Ala-Gln-Ser-Leu-Leu-Arg-Cys-Ile-Ser-Val-Gly (SEQ ID NO: 21), preferably conformationally-constrained; a protein including the sequence Gln-Leu-Tyr-Ala-Gly-Cys-Tyr-Leu-Gly-Val-Val-Ile-Ala-Ser-Ser-Leu-Ser-Ile-Arg-Val (SEQ ID NO: 22), preferably conformationally-constrained; a protein including the sequence Gln-Gln-Arg-Phe-Val-Phe-Ser-Pro-Ser-Trp-Phe-Thr-Cys-Ala-Gly-Thr-Ser-Asp-Phe-Trp-Gly-Pro-Glu-Pro-Leu-Phe-Asp-Trp-Thr-Arg-Asp (SEQ ID NO: 23), preferably conformationally-constrained; a protein including the sequence Arg-Pro-Leu-Thr-Gly-Arg-Trp-Val-Val-Trp-Gly-Arg-Arg-His-Glu-Glu-Cys-Gly-Leu-Thr (SEQ ID NO: 24), preferably conformationally-constrained; a protein including the sequence Pro-Val-Cys-Cys-Met-Met-Tyr-Gly-His-Arg-Thr-Ala-Pro-His-Ser-Val-Phe-Asn-Val-Asp (SEQ ID NO: 25), preferably conformationally-constrained; a protein including the sequence Trp-Ser-Pro-Glu-Leu-Leu-Arg-Ala-Met-Val-Ala-Phe-Arg-Trp-Leu-Leu-Glu-Arg-Arg-Pro (SEQ ID NO: 26); and substantially pure DNA encoding the immediately foregoing proteins.

[0022] The invention also includes novel proteins and other candidate interactors identified by the foregoing methods. It will be appreciated that these proteins and candidate interactors may either increase or decrease reporter gene activity and that these changes in activity may be measured using assays described herein or known in the art. Also included in the invention are methods for using conformationally constrained interactor proteins. For example, the conformationally constrained proteins of the invention may be used as reagents in assays for protein detection that involve formation of a complex between the conformationally constrained protein and a protein of interest to which it specifically binds, followed by complex detection (for example, by an immunoprecipitation, Western blot, or affinity column technique that utilizes the conformationally constrained protein as the complex-forming reagent).

[0023] Finally, the invention features a method of assaying an interaction between a first protein and a second protein, involving: (a) providing the first protein; (b) providing a fusion protein including the second protein, the second protein being conformationally-constrained; (c) contacting the first protein with the fusion protein under conditions which allow complex formation; (d) detecting the complex as an indication of an interaction; and (e) determining whether the first protein interacts with the fusion protein inside a cell.

[0024] As used herein, by “reporter gene” is meant a gene whose expression may be assayed; such genes include, without limitation, lacZ, amino acid biosynthetic genes, e.g., the yeast LEU2, HIS3, LYS2, TRP1, or URA3 genes, nucleic acid biosynthetic genes, the mammalian chloramphenicol transacetylase (CAT) gene, or any surface antigen gene for which specific antibodies are available. Reporter genes may encode any protein that provides a phenotypic marker, for example, a protein that is necessary for cell growth or a toxic protein leading to cell death, or may encode a protein detectable by a color assay leading to the presence or absence of color (e.g., fluorescent proteins and derivatives thereof). Alternatively, a reporter gene may encode a suppressor tRNA, the expression of which produces a phenotype that can be assayed. A reporter gene according to the invention includes elements (e.g., all promoter elements) necessary for reporter gene function.

[0025] By “operably linked” is meant that a gene and a regulatory sequence(s) are connected in such a way as to permit gene expression when the appropriate molecules (e.g., transcriptional activator proteins or proteins which include transcriptional activation domains) are bound to the regulatory sequence(s).

[0026] By “covalently bonded” is meant that two domains are joined by covalent bonds, directly or indirectly. That is, the “covalently bonded” proteins or protein moieties may be immediately contiguous or may be separated by stretches of one or more amino acids within the same fusion protein.

[0027] By “providing” is meant introducing the fusion proteins into the interaction system sequentially or simultaneously, and directly (as proteins) or indirectly (as genes encoding those proteins).

[0028] By “protein” is meant a sequence of amino acids of any length, constituting all or a part of a naturally-occurring polypeptide or peptide, or constituting a non-naturally-occurring polypeptide or peptide (e.g., a randomly generated peptide sequence or one of an intentionally designed collection of peptide sequences).

[0029] By a “binding moiety” is meant a stretch of amino acids which is capable of directing specific polypeptide binding to a particular DNA sequence (i.e., a “DNA-binding-protein recognition site”).

[0030] By “weak gene activating moiety” is meant a stretch of amino acids which is capable of weakly inducing the expression of a gene to whose control region it is bound. As used herein, by “weakly” is meant below the level of activation effected by GAL4 activation region II (Ma and Ptashne, Cell 48:847, 1987) and preferably at or below the level of activation effected by the B112 activation domain of Ma and Ptashne (Cell 51:113, 1987). Levels of activation may be measured using any downstream reporter gene system and comparing, in parallel assays, the level of expression stimulated by the GAL4 region II-polypeptide with the level of expression stimulated by the polypeptide to be tested.

[0031] By “altering the expression of the reporter gene” is meant an increase or decrease in the expression of the reporter gene to the extent required for detection of a change in the assay being employed. It will be appreciated that the degree of change will vary depending upon the type of reporter gene construct or reporter gene expression assay being employed.

[0032] By “conformationally-constrained” is meant a protein that has reduced structural flexibility because its amino and carboxy termini are fixed in space. As a result of this constraint, the protein may form “loops” (i.e., regions of amino acids of any shape which extend away from the constrained amino and carboxy termini). Preferably, the conformationally-constrained protein is displayed in a structurally rigid manner. Conformational constraint according to the invention may be brought about by exploiting the disulfide-bonding ability of a natural or recombinantly-introduced pair of cysteine residues, one residing at or near the amino-terminal end of the protein of interest and the other at or near the carboxy-terminal end. Alternatively, conformational constraint may be facilitated by embedding the protein of interest within a conformation-constraining protein.
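
By way of illustration only, the cysteine-pair criterion described above can be sketched as a simple sequence check. The function name, the one-letter amino acid encoding, and in particular the `window` parameter (how many residues from each end count as "at or near" a terminus) are assumptions made for this sketch and are not specified in the text; the check says nothing about whether a disulfide bond actually forms in a given cell.

```python
def has_terminal_cysteine_pair(sequence: str, window: int = 3) -> bool:
    """Return True if a cysteine (C) lies within `window` residues of each terminus.

    `sequence` is a peptide in one-letter code. `window` is a hypothetical
    cutoff for "at or near" a terminus; the source text gives no number.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    seq = sequence.strip().upper()
    # Require non-overlapping terminal windows so the two cysteines
    # are guaranteed to be distinct residues.
    if len(seq) < 2 * window:
        return False
    return "C" in seq[:window] and "C" in seq[-window:]


# Hypothetical example peptides (not taken from the sequence listing).
flanked = "CGPAAKWLDRSGPC"    # cysteine at both ends
unflanked = "CGPAAKWLDRSGPA"  # cysteine at the amino terminus only
print(has_terminal_cysteine_pair(flanked))
print(has_terminal_cysteine_pair(unflanked))
```

A stricter or looser `window` changes which peptides qualify, so any real screen would need to choose that cutoff on structural grounds.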

[0033] By “conformation-constraining protein” is meant any peptide or polypeptide which is capable of reducing the flexibility of another protein's amino and/or carboxy termini. Preferably, such proteins provide a rigid scaffold or platform for the protein of interest. In addition, such proteins preferably are capable of providing protection from proteolytic degradation and the like, and/or are capable of enhancing solubility. Examples of conformation-constraining proteins include thioredoxin and other thioredoxin-like proteins, nucleases (e.g., RNase A), proteases (e.g., trypsin), protease inhibitors (e.g., bovine pancreatic trypsin inhibitor), antibodies or structurally-rigid fragments thereof, conotoxins, and the pleckstrin homology domain. A conformation-constraining peptide can be of any appropriate length and can even be a single amino acid residue.

[0034] “Thioredoxin-like proteins” are defined herein as amino acid sequences substantially similar, e.g., having at least 18% homology, with the amino acid sequence of E. coli thioredoxin over an amino acid sequence length of 80 amino acids. Alternatively, a thioredoxin-like DNA sequence is defined herein as a DNA sequence encoding a protein or fragment of a protein characterized by having a three-dimensional structure substantially similar to that of human or E. coli thioredoxin, e.g., glutaredoxin, and optionally by containing an active-site loop. The DNA sequence of glutaredoxin is an example of a thioredoxin-like DNA sequence which encodes a protein that exhibits such substantial similarity in three-dimensional conformation and contains a Cys . . . Cys active-site loop. The amino acid sequence of E. coli thioredoxin is described in Eklund et al., EMBO J. 3:1443-1449 (1984). The three-dimensional structure of E. coli thioredoxin is depicted in FIG. 2 of Holmgren, J. Biol. Chem. 264:13963-13966 (1989). A DNA sequence encoding the E. coli thioredoxin protein is set forth in Lim et al., J. Bacteriol. 163:311-316 (1985). The three-dimensional structure of human thioredoxin is described in Forman-Kay et al., Biochemistry 30:2685-98 (1991). A comparison of the three-dimensional structures of E. coli thioredoxin and glutaredoxin is published in Xia, Protein Science 1:310-321 (1992). These four publications are incorporated herein by reference for the purpose of providing information on thioredoxin-like proteins that is known to one of skill in the art. Examples of thioredoxin-like proteins are described herein.

[0035] By “candidate interactors” is meant proteins (“candidate interacting proteins”) or compounds which physically interact with a protein of interest; this term also encompasses agonists and antagonists. Agonist interactors are identified as compounds or proteins that have the ability to increase reporter gene expression mediated by a pair of interacting proteins. Antagonist interactors are identified as compounds or proteins that have the ability to decrease reporter gene expression mediated by a pair of interacting proteins. Candidate interactors also include so-called peptide “aptamers” which specifically recognize target proteins and may be used in a manner analogous to antibody reagents; such aptamers may include one or more loops.

[0036] “Compounds” include small molecules, generally under 1000 MW, carbohydrates, polynucleotides, lipids, and the like.

[0037] By “test protein” is meant one of a pair of interacting proteins, the other member of the pair generally referred to as a “candidate interactor” (supra).

[0038] By “randomly generated” is meant sequences having no predetermined sequence; this is contrasted with “intentionally designed” sequences which have a DNA or protein sequence or motif determined prior to their synthesis.

[0039] By “mutated” is meant altered in sequence, either by site-directed or random mutagenesis. A mutated form of a protein encompasses point mutations as well as insertions, deletions, or rearrangements.

[0040] By “intracellular” is meant that the peptide is localized inside the cell, rather than on the cell surface.

[0041] By an “activated Ras” is meant any mutated form of Ras which remains bound to GTP for a period of time longer than that exhibited by the corresponding wild-type form of the protein. By “Ras” is meant any form of Ras protein including, without limitation, N-ras, K-ras, and H-ras.

[0042] The interaction trap systems described herein provide advantages over more conventional methods for isolating interacting proteins or genes encoding interacting proteins. For example, applicants' systems provide rapid and inexpensive methods having very general utility for identifying and purifying genes encoding a wide range of useful proteins based on the protein's physical interaction with a second polypeptide. This general utility derives in part from the fact that the components of the systems can be readily modified to facilitate detection of protein interactions of widely varying affinity (e.g., by using reporter genes which differ quantitatively in their sensitivity to a protein interaction). The inducible nature of the promoter used to express the interacting proteins also increases the scope of candidate interactors which may be detected since even proteins whose chronic expression is toxic to the host cell may be isolated simply by inducing a short burst of the protein's expression and testing for its ability to interact and stimulate expression of a reporter gene.

[0043] If desired, detection of interacting proteins may be accomplished through the use of weak gene activation domain tags. This approach avoids restrictions on the pool of available candidate interacting proteins which may be associated with stronger activation domains (such as GAL4 or VP16); although the mechanism is unclear, such a restriction apparently results from low to moderate levels of host cell toxicity mediated by the strong activation domain.

[0044] In addition, the claimed methods make use of conformationally-constrained proteins (i.e., proteins with reduced flexibility due to constraints at their amino and carboxy termini). Conformational constraint may be brought about by embedding the protein of interest within a conformation-constraining protein (i.e., a protein of appropriate length and amino acid composition to be capable of locking the candidate interacting protein into a particular three-dimensional structure). Examples of conformation-constraining proteins include, but are not limited to, thioredoxin (or other thioredoxin-like proteins), nucleases (e.g., RNase A), proteases (e.g., trypsin), protease inhibitors (e.g., bovine pancreatic trypsin inhibitor), antibodies or structurally-rigid fragments thereof, conotoxins, and the pleckstrin homology domain.

[0045] Alternatively, conformational constraint may be accomplished by exploiting the disulfide-bonding ability of a natural or recombinantly-introduced pair of cysteine residues, one residing at the amino terminus of the protein of interest and the other at its carboxy terminus. Such disulfide bonding locks the protein into a rigid and therefore conformationally-constrained loop structure. Disulfide bonds between amino-terminal and carboxy-terminal cysteines may be formed, for example, in the cytoplasm of E. coli trxB mutant strains. Under some conditions disulfide bonds may also form within the cytoplasm and nucleus of higher organisms harboring equivalent mutations, for example, an S. cerevisiae YTR4⁻ mutant strain (Furter et al., Nucl. Acids Res. 14:6357-6373, 1986; GenBank Accession Number P29509). In addition, the thioredoxin fusions described herein (trxA fusions) are amenable to this alternative means of introducing conformational constraint, since the cysteines at the base of peptides inserted within the thioredoxin active-site loop are at a proper distance from one another to form disulfide bonds under appropriate conditions.

[0046] Conformationally-constrained proteins as candidate interactors are useful in the invention because they are amenable to tertiary structural analysis, thus facilitating the design of simple organic molecule mimetics with improved pharmacological properties. For example, because thioredoxin has a known structure, the protein structure between the conformationally constrained regions may be more easily solved using methods such as NMR and X-ray difference analysis. Certain conformation-constraining proteins also protect the embedded protein from cellular degradation and/or increase the protein's solubility, and/or otherwise alter the capacity of the candidate interactor to interact.

[0047] Once isolated, interacting proteins can also be analyzed using the interaction trap system, with the signal generated by the interaction being an indication of any change in the proteins' interaction capabilities. In one particular example, an alteration is made (e.g., by standard in vivo or in vitro directed or random mutagenesis procedures) to one or both of the interacting proteins, and the effect of the alteration(s) is monitored by measuring reporter gene expression. Using this technique, interacting proteins with increased or decreased interaction potential are isolated. Such proteins are useful as therapeutic molecules (for example, agonists or antagonists) or, as described above, as models for the design of simple organic molecule mimetics.

[0048] Protein agonists and antagonists may also be readily identified and isolated using a variation of the interaction trap system. In particular, once a protein-protein interaction has been recorded, an additional DNA coding for a candidate agonist or antagonist, or preferably, one of a library of potential agonist- or antagonist-encoding sequences, is introduced into the host cell, and reporter gene expression is measured. Alternatively, candidate interactor agonist or antagonist compounds (i.e., including polypeptides as well as non-proteinaceous compounds, e.g., single stranded polynucleotides) are introduced into an in vivo or in vitro interaction trap system according to the invention and their ability to affect reporter gene expression is measured. A decrease in reporter gene expression (compared to a control lacking the candidate sequence or compound) indicates an antagonist. Conversely, an increase in reporter gene expression (compared again to a control) indicates an agonist. Interaction agonists and antagonists are useful as therapeutic agents or as models to design simple mimetics; if desired, an agonist or antagonist protein may be conformationally-constrained to provide the advantages described herein. Particular examples of interacting proteins for which antagonists or agonists may be identified include, but are not limited to, the IL-6 receptor-ligand pair, TGF-β receptor-ligand pair, IL-1 receptor-ligand pair and other receptor-ligand interactions, protein kinase-substrate pairs, interacting pairs of transcription factors, interacting components of signal transduction pathways (for example, cytoplasmic domains of certain receptors and G-proteins), pairs of interacting proteins involved in cell cycle regulation (for example, p16 and CDK4), and neurotransmitter pairs.
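
By way of illustration only, the comparison against a control described above can be sketched as follows. The function name, the use of a fold-change ratio, and the `min_fold_change` threshold are assumptions introduced for this sketch; the text requires only that the change be large enough to be detected in the assay being employed and does not prescribe any numerical cutoff.

```python
def classify_candidate(reporter_with_candidate: float,
                       reporter_control: float,
                       min_fold_change: float = 1.5) -> str:
    """Label a candidate by its effect on reporter gene expression.

    Both readings are reporter measurements in the same arbitrary units;
    the control lacks the candidate sequence or compound.
    `min_fold_change` is a hypothetical detection threshold.
    """
    if reporter_control <= 0:
        raise ValueError("control reading must be positive")
    if reporter_with_candidate < 0:
        raise ValueError("reporter reading cannot be negative")
    if min_fold_change <= 1:
        raise ValueError("min_fold_change must be greater than 1")

    ratio = reporter_with_candidate / reporter_control
    if ratio >= min_fold_change:
        return "agonist"        # expression increased relative to control
    if ratio <= 1 / min_fold_change:
        return "antagonist"     # expression decreased relative to control
    return "no detectable effect"


# Hypothetical readings in arbitrary units.
print(classify_candidate(300.0, 100.0))
print(classify_candidate(20.0, 100.0))
print(classify_candidate(110.0, 100.0))
```

In practice the threshold would depend on the reporter construct and on assay noise, which is why it is left as a parameter here.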

[0049] Also included in the present invention are libraries encoding conformationally-constrained proteins. Such libraries (which may include natural as well as synthetic DNA sequence collections) are expressed intracellularly or, optionally, in cell-free systems, and may be used together with any standard genetic selection or screen or with any of a number of interaction trap formats for the identification of interacting proteins, agonist or antagonist proteins, or proteins that endow a cell with any identifiable characteristic, for example, proteins that perturb cell cycle progression. Accordingly, peptide-encoding libraries (either random or designed) can be used in selections or screens which either are or are not transcriptionally-based. These libraries (which preferably include at least 100 different peptide-encoding species and more preferably include 1000, or 100,000 or greater individual species) may be transformed into any useful prokaryotic or eukaryotic host, with yeast representing the preferred host. Alternatively, such peptide-encoding libraries may be expressed in cell-free systems.

[0050] Other features and advantages of the invention will be apparent from the following detailed description thereof, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0051] The drawings are first briefly described.

[0052] FIGS. 1A-1C illustrate one interaction trap system according to the invention.

[0053] FIG. 2 is a diagram of a library vector pJM1.

[0054] FIG. 3A is a photograph showing the interaction of peptide aptamers with other proteins.

[0055] FIG. 3B illustrates the sequence of exemplary Cdk2 interacting peptides.

[0056] FIG. 4A is a photograph showing the interaction of peptide aptamers with other proteins. The designations of these peptide aptamers differ from those shown in FIG. 3A, and correspond to the numbering shown in FIG. 4B. To carry out these experiments, yeast strain EGY48 was transformed with either a plasmid expressing an anti-Cdk2 aptamer or with a plasmid expressing a control 20-mer peptide loop, and the strain was then mated to different bait strains as described in Finley et al. (Proc. Natl. Acad. Sci. U.S.A. 91:12980-12984 (1994)).

[0057] FIG. 4B illustrates the sequence of the exemplary Cdk2 interacting peptide aptamers assayed in FIG. 4A.

[0058] FIG. 5 illustrates coprecipitation of peptides 3 and 13 by Gst-Cdk2. Lane 1. Gst beads, extract contains TrxA;

[0059] Lane 2. Gst beads, extract contains TrxA-peptide 3;

[0060] Lane 3. Gst beads, extract contains TrxA-peptide 13;

[0061] Lane 4. Gst-Cdk2 beads, extract contains TrxA;

[0062] Lane 5. Gst-Cdk2 beads, extract contains TrxA-peptide 3; and

[0063] Lane 6. Gst-Cdk2, extract contains TrxA-peptide 13.

[0064] FIG. 6 illustrates coprecipitation of the peptide aptamers of FIGS. 4A and 4B.

[0065] FIG. 7 illustrates a representative binding affinity graph produced using an evanescent wave instrument.

[0066] FIG. 8 illustrates the ability of exemplary peptide aptamers of FIGS. 4A and 4B to inhibit phosphorylation of Histone H1 by Cdk2/cyclin E kinase.

[0067] FIG. 9 illustrates the vector BRM1E6-H-Ras(G12V).

[0068] FIG. 10 illustrates the vector pEG202-H-Ras(G12V).

DETAILED DESCRIPTION

[0069] Applicants have developed a novel interaction trap system for the identification and analysis of conformationally-constrained proteins that either physically interact with a second protein of interest or that antagonize or agonize such an interaction. In one embodiment, the system involves a eukaryotic host strain (e.g., a yeast strain) which is engineered to produce a protein of therapeutic or diagnostic interest as a fusion protein covalently bonded to a known DNA binding domain; this protein is referred to as a “bait” protein because its purpose in the system is to “catch” useful, but as yet unknown or uncharacterized, interacting polypeptides (termed the “prey”; see below). The eukaryotic host strain also contains one or more “reporter genes,” i.e., genes whose transcription is detected in response to a bait-prey interaction. Bait proteins, via their DNA binding domain, bind to their specific DNA recognition site upstream of a reporter gene; reporter transcription is not stimulated, however, because the bait protein lacks an activation domain.

[0070] To isolate DNA sequences encoding novel interacting proteins, members of a DNA expression library (e.g., a cDNA or synthetic DNA library, either random or intentionally biased) are introduced into the strain containing the reporter gene and bait protein; each member of the library directs the synthesis of a candidate interacting protein fused to an invariant gene activation domain tag. Those library-encoded proteins that physically interact with the promoter-bound bait protein are referred to as “prey” proteins. Such bound prey proteins (via their activation domain tag) detectably activate the expression of the downstream reporter gene and provide a ready assay for identifying a particular DNA clone encoding an interacting protein of interest. In the instant invention, each candidate prey protein is conformationally-constrained (for example, either by embedding the protein within a conformation-constraining protein or by linking together the protein's amino and carboxy termini). Such a protein is maintained in a fixed, three-dimensional structure, facilitating mimetic drug design.

[0071] An example of one interaction trap system according to the invention is shown in FIGS. 1A-C. FIG. 1A shows a leucine auxotroph yeast strain containing two reporter genes, LexAop-LEU2 and LexAop-lacZ, and a constitutively expressed bait protein gene. The bait protein (shown as a pentagon) is fused to a DNA binding domain (shown as a circle). The DNA binding protein recognizes and binds a specific DNA-binding-protein recognition site (shown as a solid rectangle) operably-linked to a reporter gene. In FIGS. 1B and 1C, the cells additionally contain candidate prey proteins (candidate interactors) (shown as an empty rectangle in 1B and an empty hexagon in 1C) fused to an activation domain (shown as a solid square); each prey protein is embedded in a conformation-constraining protein (shown as two solid half circles). FIG. 1B shows that if the candidate prey protein does not interact with the transcriptionally-inert LexA-fusion bait protein, the reporter genes are not transcribed; the cell cannot grow into a colony on leu- medium, and it is white on Xgal medium because it contains no β-galactosidase activity. FIG. 1C shows that, if the candidate prey protein interacts with the bait, both reporter genes are active; the cell forms a colony on leu- medium, and cells in that colony have β-galactosidase activity and are blue on Xgal medium. Preferably, in this system, the bait protein (i.e., the protein containing a site-specific DNA binding domain) is transcriptionally inert, and the reporter genes (which are bound by the bait protein) have essentially no basal transcription.
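
The reporter logic of FIGS. 1B and 1C can be summarized as a small truth table. The following Python sketch is illustrative only; the names `Phenotype` and `predict_phenotype` are hypothetical, and the `bait_self_activates` flag is an assumption added to show why a transcriptionally inert bait is preferred (a bait carrying its own activation function would switch on both reporters without any prey interaction).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    grows_on_leu_minus: bool   # LexAop-LEU2 reporter transcribed
    colony_color_on_xgal: str  # LexAop-lacZ reporter: "blue" or "white"


def predict_phenotype(prey_binds_bait, bait_self_activates=False):
    """Return the expected phenotype of a cell in the two-reporter system.

    prey_binds_bait:     the activation-tagged prey contacts the DNA-bound bait
    bait_self_activates: hypothetical flag for a bait that is NOT inert
    """
    reporters_on = prey_binds_bait or bait_self_activates
    return Phenotype(
        grows_on_leu_minus=reporters_on,
        colony_color_on_xgal="blue" if reporters_on else "white",
    )


if __name__ == "__main__":
    print("FIG. 1B case:", predict_phenotype(prey_binds_bait=False))
    print("FIG. 1C case:", predict_phenotype(prey_binds_bait=True))
    print("non-inert bait:", predict_phenotype(False, bait_self_activates=True))
```

The third case is a false positive in the sense of the selection: the cell scores as an interactor although no bait-prey contact occurred.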

[0072] Each component of the system is now described in more detail.

Bait Proteins

[0073] The selection host strain depicted in FIGS. 1A-C contains a DNA encoding a bait protein fused to a DNA encoding a DNA binding moiety derived from the bacterial LexA protein. The use of a LexA DNA binding domain provides certain advantages. For example, in yeast, the LexA moiety contains no activation function and has no known effect on transcription of yeast genes (Brent and Ptashne, Nature 312:612-615, 1984; Brent and Ptashne, Cell 43:729-736, 1985). In addition, use of the LexA rather than, for example, the GAL4 DNA-binding domain allows conditional expression of prey proteins in response to galactose induction; this facilitates detection of prey proteins that might be toxic to the host cell if expressed continuously. Finally, the use of a well-defined system, such as LexA, allows knowledge regarding the interaction between LexA and the LexA binding site (i.e., the LexA operator) to be exploited for the purpose of optimizing operator occupancy and/or optimizing the geometry of the bound bait protein to effect maximal gene activation.

[0074] Preferably, the bait protein also includes a LexA dimerization domain; this optional domain facilitates efficient LexA dimer formation. Because LexA binds its DNA binding site as a dimer, inclusion of this domain in the bait protein also optimizes the efficiency of operator occupancy (Golemis and Brent, Mol. Cell Biol. 12:3006-3014 (1992)).

[0075] LexA represents a preferred DNA binding domain in the invention. However, any other transcriptionally-inert or essentially transcriptionally-inert DNA binding domain may be used in the interaction trap system; such DNA binding domains are well known and include the DNA binding portions of the proteins ACE1 (CUP1), lambda cI, lac repressor, jun, fos, GCN4, or the Tet repressor. The GAL4 DNA binding domain represents a slightly less preferred DNA binding moiety for the bait proteins.

[0076] Bait proteins may be chosen from any protein of interest and include proteins of unknown, known, or suspected diagnostic, therapeutic, or pharmacological importance. Preferred bait proteins include oncoproteins (such as myc, particularly the C-terminus of myc, ras, src, fos, and particularly the oligomeric interaction domains of fos) or any other proteins involved in cell cycle regulation (such as kinases, phosphatases, the cytoplasmic portions of membrane-associated receptors). Particular examples of preferred bait proteins include cyclin and cyclin dependent kinases (for example, Cdk2) or receptor-ligand pairs, or neurotransmitter pairs, or pairs of other signalling proteins. In each case, the protein of interest is fused to a known DNA binding domain as generally described herein. Examples are provided below using Cdk2 and Ras baits.

Reporters

[0077] As shown in FIG. 1B, one preferred host strain according to the invention contains two different reporter genes, the LEU2 gene and the lacZ gene, each carrying an upstream binding site for the bait protein. The reporter genes depicted in FIG. 1B each include, as an upstream binding site, one or more LexA operators in place of their native Upstream Activation Sequences (UASs). These reporter genes may be integrated into the chromosome or may be carried on autonomously replicating plasmids (e.g., yeast 2μ plasmids).

[0078] A combination of two such reporters is preferred in the in vivo embodiments of the invention for a number of reasons. First, the LexAop-LEU2 construction allows cells that contain interacting proteins to select themselves by growth on medium that lacks leucine, facilitating the examination of large numbers of potential candidate interactor protein-containing cells. Second, the LexAop-lacZ reporter allows LEU⁺ cells to be quickly screened to confirm an interaction. And, third, among other technical considerations, the LexAop-LEU2 reporter provides an extremely sensitive first selection, while the LexAop-lacZ reporter allows discrimination between proteins of different interaction affinities.

[0079] Although the reporter genes described herein represent a preferred embodiment of the invention, other equivalent genes whose expression may be detected or assayed by standard techniques may also be employed in conjunction with, or instead of, the LEU2 and lacZ genes. Generally, such reporter genes encode an enzyme that provides a phenotypic marker, for example, a protein that is necessary for cell growth or a toxic protein leading to cell death, or encode a protein detectable by a color assay or because its expression leads to the presence or absence of color. Alternatively, the reporter gene may encode a suppressor tRNA whose expression may be assayed, for example, because it suppresses a lethal host cell mutation. Particular examples of other useful genes whose transcription can be detected include amino acid and nucleic acid biosynthetic genes (such as yeast HIS3, URA3, TRP1, and LYS2), GAL1, E. coli galK (which complements the yeast GAL1 gene), and the reporter genes CAT, GUS, fluorescent proteins and derivatives thereof, and any gene encoding a cell surface antigen for which antibodies are available (e.g., CD4). Reporter genes may be assayed by either qualitative or quantitative means to distinguish candidate interactors as agonists or antagonists.

Prey proteins

[0080] In the selection described herein, another DNA construction is utilized which encodes a series of candidate interacting proteins (i.e., prey proteins); each is conformationally-constrained, either by being embedded in a conformation-constraining protein or because the prey protein's amino and carboxy termini are linked (e.g., by disulfide bonding). An exemplary prey protein includes an invariant N-terminal moiety carrying, amino to carboxy terminal, an ATG for protein expression, an optional nuclear localization sequence, a weak activation domain (e.g., the B112 or B42 activation domains of Ma and Ptashne; Cell 51:113, 1987), and an optional epitope tag for rapid immunological detection of fusion protein synthesis. Library sequences, random or intentionally designed synthetic DNA sequences, or sequences encoding conformationally-constrained proteins, may be inserted downstream of this N-terminal fragment to produce fusion genes encoding prey proteins.

[0081] Prey proteins other than those described herein are also useful in the invention. For example, cDNAs may be constructed from any mRNA population and inserted into an equivalent expression vector. Such a library of choice may be constructed de novo using commercially available kits (e.g., from Stratagene, La Jolla, Calif.) or using well established preparative procedures (see, e.g., Current Protocols in Molecular Biology, New York, John Wiley & Sons, 1987). Alternatively, a number of cDNA libraries (from a number of different organisms) are publicly and commercially available; sources of libraries include, e.g., Clontech (Palo Alto, Calif.) and Stratagene (La Jolla, Calif.). It is also noted that prey proteins need not be naturally occurring full-length polypeptides. In preferred embodiments, prey proteins are encoded by synthetic DNA sequences, are the products of randomly generated open reading frames, are open reading frames synthesized with an intentional sequence bias, or are portions thereof. Preferably, such short randomly generated sequences encode peptides between 1 (and preferably, 6) and 60 amino acids in length. In one particular example, the prey protein includes only an interaction domain; such a domain may be useful as a therapeutic to modulate bait protein activity (i.e., as an antagonist or agonist). In another particular example, the prey protein contains one or more loops. Such a prey protein may be used as an immunological reagent for diagnostic purposes or for any of the therapeutic purposes described herein; in this context, the different loops may recognize different portions of the bait protein and may increase specificity. In addition, a prey protein may be a combination of multiple interacting peptides connected in reading frame (if desired, with alternating conformation-constraining sequences) to provide a further optimized prey protein. In one example, each of these interacting peptides constitutes one loop of the final interacting protein.

[0082] Similarly, any number of activation domains may be used for that portion of the prey molecule; such activation domains are preferably weak activation domains, i.e., weaker than the GAL4 activation region II moiety and preferably no stronger than B112 (as measured, e.g., by a comparison with GAL4 activation region II or B112 in parallel β-galactosidase assays using lacZ reporter genes); such a domain may, however, be weaker than B112. In particular, the extraordinary sensitivity of the LEU2 selection scheme allows even extremely weak activation domains to be utilized in the invention. Examples of other useful weak activation domains include B17, B42, and the amphipathic helix (AH) domains described in Ma and Ptashne (Cell 51:113, 1987), Ruden et al. (Nature 350:426-430, 1991), and Giniger and Ptashne (Nature 330:670, 1987).

[0083] The prey proteins, if desired, may include other optional nuclear localization sequences (e.g., those derived from the GAL4 or MATa2 genes) or other optional epitope tags (e.g., portions of the c-myc protein or the flag epitope available from Immunex). These sequences optimize the efficiency of the system, but are not required for its operation. In particular, the nuclear localization sequence optimizes the efficiency with which prey molecules reach the nuclear-localized reporter gene construct(s), thus increasing their effective concentration and allowing one to detect weaker protein interactions. The epitope tag merely facilitates a simple immunoassay for fusion protein expression.

[0084] Those skilled in the art will also recognize that the above-described reporter gene, DNA binding domain, and gene activation domain components may be derived from any appropriate eukaryotic or prokaryotic source, including yeast, mammalian cell, and prokaryotic cell genomes or cDNAs as well as artificial sequences. Moreover, although yeast represents a preferred host organism for the interaction trap system (for reasons of ease of propagation, genetic manipulation, and large scale screening), other host organisms such as mammalian cells may also be utilized. If a mammalian system is chosen, a preferred reporter gene is the sensitive and easily assayed CAT gene; useful DNA binding domains and gene activation domains may be chosen from those described above (e.g., the LexA DNA binding domain and the B42 or B112 activation domains).

Conformation-Constraining Proteins

[0085] According to one embodiment of the present invention, the DNA sequence encoding the prey protein is embedded in a DNA sequence encoding a conformation-constraining protein (i.e., a protein that decreases the flexibility of the amino and carboxy termini of the prey protein). Methods for directly linking the amino and carboxy termini of a protein (e.g., through disulfide bonding of appropriately positioned cysteine residues) are described above. As an alternative to this approach, conformation-constraining proteins may be utilized. In general, conformation-constraining proteins act as scaffolds or platforms, which limit the number of possible three dimensional configurations the peptide or protein of interest is free to adopt. Preferred examples of conformation-constraining proteins are thioredoxin or other thioredoxin-like sequences, but many other proteins are also useful for this purpose. Preferably, conformation-constraining proteins are small in size (generally, less than or equal to 200 amino acids), rigid in structure, of known three dimensional configuration, and are able to accommodate insertions of proteins of interest without undue disruption of their structures. A key feature of such proteins is the availability, on their solvent exposed surfaces, of locations where peptide insertions can be made (e.g., the thioredoxin active-site loop). It is also preferable that conformation-constraining protein producing genes be highly expressible in various prokaryotic and eukaryotic hosts, or in suitable cell-free systems, and that the proteins be soluble and resistant to protease degradation. Examples of conformation-constraining proteins useful in the invention include nucleases (e.g., RNase A), proteases (e.g., trypsin), protease inhibitors (e.g., bovine pancreatic trypsin inhibitor), antibodies or rigid fragments thereof, conotoxins, and the pleckstrin homology domain. This list, however, is not limiting. It is expected that other conformation-constraining proteins having sequences not identified above, or perhaps not yet identified or published, may be useful based upon their structural stability and rigidity.

[0086] As mentioned above, one preferred conformation-constraining protein according to the invention is thioredoxin or other thioredoxin-like proteins. As one example of a thioredoxin-like protein useful in this invention, E. coli thioredoxin has the following characteristics. E. coli thioredoxin is a small protein, only 11.7 kD, and can be produced to high levels. The small size and capacity for high level synthesis of the protein contribute to a high intracellular concentration. E. coli thioredoxin is further characterized by a very stable, tight tertiary structure which can facilitate protein purification.

[0087] The three dimensional structure of E. coli thioredoxin is known and contains several surface loops, including a distinctive Cys . . . Cys active-site loop between residues Cys₃₃ and Cys₃₆ which protrudes from the body of the protein. This Cys . . . Cys active-site loop is an identifiable, accessible surface loop region and is not involved in interactions with the rest of the protein which contribute to overall structural stability. It is therefore a good candidate as a site for prey protein insertions. Human thioredoxin, glutaredoxin, and other thioredoxin-like molecules also contain this Cys . . . Cys active-site loop. Both the amino- and carboxyl-termini of E. coli thioredoxin are on the surface of the protein and are also readily accessible for fusion construction. E. coli thioredoxin is also stable to proteases, stable in heat up to 80° C. and stable to low pH.

[0088] Other thioredoxin-like proteins encoded by thioredoxin-like DNA sequences useful in this invention share homologous amino acid sequences, and similar physical and structural characteristics. Thus, DNA sequences encoding other thioredoxin-like proteins may be used in place of E. coli thioredoxin according to this invention. For example, DNA sequences encoding other species' thioredoxins, e.g., human thioredoxin, are suitable. Human thioredoxin has a three-dimensional structure that is virtually superimposable on that of E. coli thioredoxin, as determined by comparing the NMR structures of the two molecules (Forman-Kay et al., Biochem. 30:2685 (1991)). Human thioredoxin also contains an active-site loop structurally and functionally equivalent to the Cys . . . Cys active-site loop found in the E. coli protein. It can be used in place of or in addition to E. coli thioredoxin in the production of protein and small peptides in accordance with the method of this invention. Insertions into the human thioredoxin active-site loop and onto the amino terminus may be as well-tolerated as those in E. coli thioredoxin.

[0089] Other thioredoxin-like sequences which may be employed in this invention include all or portions of the proteins glutaredoxin and various species' homologs thereof (Holmgren, supra). Although E. coli glutaredoxin and E. coli thioredoxin share less than 20% amino acid homology, the two proteins do have conformational and functional similarities (Eklund et al., EMBO J. 3:1443-1449 (1984)) and glutaredoxin contains an active-site loop structurally and functionally equivalent to the Cys . . . Cys active-site loop of E. coli thioredoxin. Glutaredoxin is therefore a thioredoxin-like molecule as defined herein.

[0090] In addition, the DNA sequence encoding protein disulfide isomerase (PDI), or that portion containing the thioredoxin-like domain, and its various species' homologs thereof (Edman et al., Nature 317:267-270 (1985)) may also be employed as a thioredoxin-like DNA sequence, since a repeated domain of PDI shares >30% homology with E. coli thioredoxin and that repeated domain contains an active-site loop structurally and functionally equivalent to the Cys . . . Cys active-site loop of E. coli thioredoxin. The two latter publications are incorporated herein by reference for the purpose of providing information on glutaredoxin and PDI which is known and available to one of skill in the art.

[0091] Similarly, the DNA sequence encoding phosphoinositide-specific phospholipase C (PI-PLC), fragments thereof, and various species' homologs thereof (Bennett et al., Nature, 334:268-270 (1988)) may also be employed in the present invention as a thioredoxin-like sequence based on the amino acid sequence homology with E. coli thioredoxin, or alternatively based on similarity in three dimensional conformation and the presence of an active-site loop structurally and functionally equivalent to the Cys . . . Cys active-site loop of E. coli thioredoxin. All or a portion of the DNA sequence encoding an endoplasmic reticulum protein, ERp72, or various species' homologs thereof are also included as thioredoxin-like DNA sequences for the purposes of this invention (Mazzarella et al., J. Biol. Chem. 265:1094-1101 (1990)) based on amino acid sequence homology, or alternatively based on similarity in three dimensional conformation and the presence of an active-site loop structurally and functionally equivalent to the Cys . . . Cys active-site loop of E. coli thioredoxin. Another thioredoxin-like sequence is a DNA sequence which encodes all or a portion of an adult T-cell leukemia-derived factor (ADF) or other species' homologs thereof (Wakasugi et al., Proc. Natl. Acad. Sci. USA, 87:8282-8286 (1990)). ADF is now believed to be human thioredoxin. Similarly, the protein responsible for promoting disulfide bond formation in the periplasm of E. coli, the product of the dsbA gene (Bardwell et al., Cell 67:581-89, 1991), also can be considered a thioredoxin-like sequence. The three latter publications are incorporated herein by reference for the purpose of providing information on PI-PLC, ERp72, ADF, and dsbA which are known and available to one of skill in the art.

[0092] It is expected from the definition of thioredoxin-like sequences used above that other sequences not specifically identified above, or perhaps not yet identified or published, may be useful as thioredoxin-like sequences based on their amino acid sequence homology to E. coli thioredoxin or based on having three dimensional structures substantially similar to E. coli or human thioredoxin and having an active-site loop functionally and structurally equivalent to the Cys . . . Cys active-site loop of E. coli thioredoxin. One skilled in the art can determine whether a molecule has these latter two characteristics by comparing its three-dimensional structure, as analyzed for example by x-ray crystallography or two-dimensional NMR spectroscopy, with the published three-dimensional structure for E. coli thioredoxin and by analyzing the amino acid sequence of the molecule to determine whether it contains an active-site loop that is structurally and functionally equivalent to the Cys . . . Cys active-site loop of E. coli thioredoxin. By “substantially similar” in three-dimensional structure or conformation is meant as similar to E. coli thioredoxin as is glutaredoxin. In addition, a predictive algorithm has been described which enables the identification of thioredoxin-like proteins via computer-assisted analysis of primary sequence (Ellis et al., Biochemistry 31:4882-91 (1992)). Based on the above description, one of skill in the art will be able to select and identify, or, if desired, modify, a thioredoxin-like DNA sequence for use in this invention without resort to undue experimentation. For example, simple point mutations made to portions of native thioredoxin or native thioredoxin-like sequences which do not affect the structure of the resulting molecule are alternative thioredoxin-like sequences, as are allelic variants of native thioredoxin or native thioredoxin-like sequences.

[0093] DNA sequences which hybridize to the sequence for E. coli thioredoxin or its structural homologs under either stringent or relaxed hybridization conditions also encode thioredoxin-like proteins for use in this invention. An example of one such stringent hybridization condition is hybridization at 4× SSC at 65° C., followed by a washing in 0.1× SSC at 65° C. for an hour. Alternatively, an exemplary stringent hybridization condition is in 50% formamide, 4× SSC at 42° C. Examples of non-stringent hybridization conditions are 4× SSC at 50° C. or hybridization with 30-40% formamide at 42° C. The use of all such thioredoxin-like sequences is believed to be encompassed in this invention.

[0094] It may be preferred for a variety of reasons that prey proteins be fused within the active-site loop of thioredoxin or thioredoxin-like molecules. The face of thioredoxin surrounding the active-site loop has evolved, in keeping with the protein's major function as a nonspecific protein disulfide oxido-reductase, to be able to interact with a wide variety of protein surfaces. The active-site loop region is found between segments of strong secondary structure and this provides a rigid platform to which one may tether prey proteins.

[0095] A small prey protein inserted into the active-site loop of a thioredoxin-like protein is present in a region of the protein which is not involved in maintaining tertiary structure. Therefore the structure of such a fusion protein is stable. Indeed, E. coli thioredoxin can be cleaved into two fragments at a position close to the active-site loop, and yet the tertiary interactions stabilizing the protein remain.

[0096] The active-site loop of E. coli thioredoxin has the sequence NH₂ . . . Cys₃₃-Gly-Pro-Cys₃₆ . . . COOH. Fusing a selected prey protein with a thioredoxin-like protein in the active-site loop portion of the protein constrains the prey at both ends, reducing the degrees of conformational freedom of the prey protein, and consequently reducing the number of alternative structures taken by the prey. The inserted prey protein is bound at each end by cysteine residues, which may form a disulfide linkage to each other as they do in native thioredoxin and further limit the conformational freedom of the inserted prey.

[0097] In addition, by being positioned within the active-site loop, the prey protein is placed on the surface of the thioredoxin-like protein, an advantage for use in screening for bioactive protein conformations and other assays. In general, the utility of thioredoxin or other thioredoxin-like proteins is described in McCoy et al., U.S. Pat. No. 5,270,181 and LaVallie et al., Bio/Technology 11:187-193 (1993). These two references are hereby incorporated by reference.

[0098] There now follows a description of thioredoxin interaction trap systems according to the invention. These examples are designed to illustrate, not limit, the invention.

Thioredoxin Interaction Trap System

[0099] Interaction trap systems utilizing conformationally-constrained proteins have been developed for the detection of protein interactions, the identification and isolation of proteins participating in such interactions, the identification and isolation of agonists and antagonists of such interactions, and the identification and isolation of interacting peptide aptamers that may be used in protein detection assays in a manner analogous to antibody-type reagents. Exemplary systems are now described.

[0100] 1. Thioredoxin Interaction Trap with Cdk2 bait

[0101] Progression of eukaryotic cells through the cell cycle requires the coordinated action of a number of regulatory proteins that interact with and regulate the activity of Cdks (Sherr, Cell 79:551-555 (1994)). These modulatory proteins include cyclins, which positively regulate Cdk activity, Cyclin Dependent kinase inhibitors (Ckis), and a number of protein kinases and phosphatases, some of which, such as CAK and Cdc25, positively regulate kinase activity, some of which, such as Wee1, inhibit kinase activity, and some of which, such as Cdi1 (Gyuris et al., Cell 75:791-803 (1993)), have effects that are so far unknown (reviewed in Morgan, Nature 374:131-134 (1995)). Cdk2 is thought to be required for higher eukaryotic cells to progress from G1 into S-phase (Fang & Newport, J. Cell Biol. 66:731-742 (1991); Pagano et al., J. Cell Biol. 121:101-111 (1993); van den Heuvel & Harlow, Science 262:2050-2054 (1993)). Cdk2 kinase activity is positively regulated by Cyclin E and Cyclin A (Koff et al., Science 257:1689-1694 (1992); Dulic et al., Science 257:1958-1961 (1992); Tsai et al., Nature 353:174-7 (1991)) and negatively regulated by p21, p27 and p57 (Harper et al., Cell 75:805-816 (1993); Polyak et al., Genes Dev. 8:9-22 (1994); Toyoshima & Hunter, Cell 78:67-74 (1994); Matsuoka et al., Genes Dev. 9:660-662 (1995); Lee et al., Genes Dev. 9:639-649 (1995)); in addition, Cdk2 complexes with Cdi1 at the G1 to S transition (Gyuris et al., supra). Here we describe the use of a yeast two-hybrid system to select molecules which recognize Cdk2 from combinatorial libraries.

[0102] A prey vector is constructed containing the E. coli thioredoxin gene (trxA). pJG 4-4 (Gyuris et al., supra) is used as the vector backbone and cut with EcoRI and XhoI. A DNA fragment encoding the B112 transcription activation domain is obtained by PCR amplification of plasmid LexA-B112 (Doug Ruden, Ph.D. thesis, Harvard University, 1992) and cut with MunI and NdeI. The E. coli trxA gene is excised from the vector pALTRXA-781 (U.S. Pat. No. 5,292,646; InVitrogen Corp., San Diego, Calif.) by digestion with NdeI and SalI. The trxA and B112 fragments are then ligated by standard techniques into the EcoRI/XhoI-cut pJG 4-4 backbone, forming pYENAeTRX. This vector encodes a fusion protein comprising the SV40 nuclear localization domain, the B112 transcription activation domain, a hemagglutinin epitope tag, and E. coli thioredoxin (FIG. 2).

[0103] Peptide libraries are constructed as follows. The DNA oligomer 5′ GACTGACTGGTCCG(NNK)₂₀GGTCCTCAGTCAGTCAG 3′ (with N = A, C, G, T and K = G, T) (SEQ ID NO: 4) is synthesized and annealed to the second oligomer (5′ CTGACTGACTGAGGACC 3′) (SEQ ID NO: 5) in order to form double stranded DNA at the 3′ end of the first oligomer. The second strand is enzymatically completed using Klenow enzyme, priming synthesis with the second oligomer. The product is cleaved with AvaII, and inserted into RsrII cut pYENAeTRX. After ligation, the construct is used to transform E. coli by standard methods (Ausubel et al., Current Protocols in Molecular Biology, (Greene and Wiley-Interscience, New York, 1987-1994)). The library contained 2.9×10⁹ members, of which more than 10⁹ directed the synthesis of peptides. Twenty-mers were chosen as preferred peptides because they were long enough to fold into many different patterns of shape and charge and short enough that many of the encoding oligonucleotides lacked stop codons. Because of the presence of fortuitous restriction sites in some coding oligonucleotides and because some library members contained double inserts, approximately one fifth of the constrained peptides were longer or shorter than unit length.
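
The remark that twenty-mers are short enough for many encoding oligonucleotides to lack stop codons can be illustrated with a back-of-the-envelope calculation. The Python sketch below enumerates the NNK codons (N = A, C, G, T; K = G, T), counts how many are stop codons under the standard genetic code, and estimates the fraction of 20-codon inserts with no stop. It assumes each position is drawn uniformly and independently, and it ignores frameshifts, double inserts, and fortuitous restriction sites, so it is only a rough estimate and not necessarily how the figures in the text were derived. The function names are hypothetical.

```python
from itertools import product

# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def nnk_codons():
    """Enumerate every codon matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(bases) for bases in product("ACGT", "ACGT", "GT")]


def fraction_without_stop(n_codons):
    """Estimate the fraction of random NNK inserts of n_codons with no stop.

    Assumes uniform, independent base incorporation at each position.
    """
    codons = nnk_codons()
    sense = sum(1 for codon in codons if codon not in STOP_CODONS)
    return (sense / len(codons)) ** n_codons


if __name__ == "__main__":
    codons = nnk_codons()
    stops = [c for c in codons if c in STOP_CODONS]
    print(f"NNK codons: {len(codons)}, stop codons among them: {stops}")
    frac = fraction_without_stop(20)
    print(f"Estimated stop-free fraction for (NNK)20: {frac:.2f}")
    print(f"Applied to 2.9e9 members: about {2.9e9 * frac:.1e} stop-free")
```

Restricting the third position to G or T leaves TAG as the only possible stop codon, which is why roughly half of the 20-codon inserts are expected to be stop-free under these assumptions; this is of the same order as the reported figure of more than 10⁹ peptide-encoding members out of 2.9×10⁹.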

[0104] To screen for interacting peptides or “aptamers,” 100 μg of the library was used to transform the yeast strain EGY48 (Mata his3 leu2::2Lexop-LEU2 ura3 trp1 LYS2; Gyuris et al., supra). This strain also contained the reporter plasmid pSH 18-34, a pLR1A1 derivative, containing the yeast 2μ replication origin, the URA3 gene, and a GAL1-lacZ reporter gene with the GAL1 upstream regulatory elements replaced with 4 colE1 LexA operators (West et al., Mol. Cell Biol. 4:2467, 1984; Ebina et al., J. Biol. Chem. 258:13258, 1983; Hanes and Brent, Cell 57:1275, 1989), as well as the bait vector pLexA202-Cdk2 (Cdk2 encodes the human cyclin dependent kinase 2, an essential cell cycle enzyme) (Gyuris et al., supra; Tsai et al., Oncogene 8:1593, 1993). About 2.5×10⁶ transformants were obtained and pooled. The first selection step, growth on leucine-deficient medium after induction with 2% galactose/1% raffinose (Gyuris et al., supra; Guthrie and Fink, Guide to Yeast Genetics and Molecular Biology, Vol. 194, 1991), was performed with an 8-fold redundancy (20×10⁶ cfu) of the library in yeast, and about 900 colonies were obtained after growth at 30° C. for 5 days. The 300 largest colonies were streak purified and tested for the galactose-dependent expression of the LEU2 gene product and of β-galactosidase (encoded by pSH 18-34), the latter giving rise to blue yeast colonies in the presence of Xgal in the medium (Ausubel et al., supra). Thirty-three colonies fulfilled these requirements which, after sequencing, included 14 different clones, all of which bound specifically to a LexA-Cdk2 bait but not to LexA or to a LexA-Cdk3 bait (Finley et al., Proc. Natl. Acad. Sci. USA 91:12980-12984 (1994)). The strength of binding was judged according to the intensity of the blue color formed by a colony of the yeast that contained each different interactor.
By this means, each interactor was classified as a strong, medium, or weak binder, which was normalized to the amount of blue color caused by the various naturally-occurring partner proteins of Cdk2 in side by side mating interaction assays. An example of the peptide sequence of one representative of each class is given here:

Strong binder: peptide 3 (SEQ ID NO: 6)
-Gly₃₄-Pro₃₅-Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-Leu-Phe-Arg-Ser-Leu-Phe-Gly₃₄-Pro₃₅-

Medium binder: peptide 2 (SEQ ID NO: 7)
-Gly₃₄-Pro₃₅-Met-Val-Val-Ala-Ala-Glu-Ala-Val-Arg-Thr-Val-Leu-Leu-Ala-Asp-Gly-Gly-Asp-Val-Thr-Gly₃₄-Pro₃₅-

Weak binder: peptide 6 (SEQ ID NO: 8)
-Gly₃₄-Pro₃₅-Pro-Asn-Trp-Pro-His-Gln-Leu-Arg-Val-Gly-Arg-Val-Leu-Trp-Glu-Arg-Leu-Ser-Phe-Glu-Gly₃₄-Pro₃₅-

[0105] Control peptides which do not bind detectably are: c4: Arg-Arg-Ala-Ser-Val-Cys-Gly-Pro-Leu-Leu-Ser-Lys-Arg-Gly-Tyr-Gly-Pro-Pro-Phe-Tyr-Leu-Ala-Gly-Met-Thr-Ala-Pro-Glu-Gly-Pro-Cys (SEQ ID NO: 14) and c: Arg-Arg-Ala-Ser-Val-Cys-Gly-Pro-Leu-His-Tyr-Trp-Gly-Leu-Gly-Gly-Phe-Val-Asp-Leu-Trp-Gln-Glu-Thr-Thr-Gly-Val-Gly-Pro-Cys (SEQ ID NO: 15).

[0106] FIG. 3A shows that 5 of the peptide aptamers reacted strongly with the LexA-Cdk2 bait but not with a large number of unrelated proteins. None of the Cdk2 aptamers interacted with CDC28 or Cdc2, which are both 65% identical to Cdk2. However, 2 of the 5 Cdk2 interactors also interacted with human Cdk3, and 1 of the 5 also interacted with Drosophila Cdc2c, suggesting that these peptides recognize determinants common to these proteins. Both theoretical considerations and calibration experiments with lambda repressor's C terminus suggested that transcription of the pSH18-34 reporter in EGY48 can be activated by protein interactions with Kds as weak as 10⁻⁶ M. The fact that peptides 3 and 13 directed robust transcription of this LexAop-lacZ reporter was consistent with the idea that they may interact significantly more tightly. The sequence of these peptides is shown in FIG. 3B.

[0107] In related experiments, 6 additional aptamers (i.e., pep6 (SEQ ID NO: 21), pep7 (SEQ ID NO: 22), pep9 (SEQ ID NO: 23), pep12 (SEQ ID NO: 24), pep13 (SEQ ID NO: 25), and pep14 (SEQ ID NO: 26)) were shown to interact with the LexA-Cdk2 bait but not with unrelated proteins such as Max or Rb, or with certain Cdk family members such as Cdk4, which shares 47% sequence identity with Cdk2 (FIG. 4A). However, some aptamers interacted with other Cdk family members. The fact that different peptide aptamers showed distinct patterns of cross-reactivity with different Cdks indicated that these aptamers recognized different epitopes conserved among various Cdks. The sequence of the peptide loops is shown in FIG. 4B. Non-unit-length peptides occurred at the same frequency among the Cdk2 interacting aptamers as in the library as a whole. No aptamer showed significant sequence similarity to known proteins, as expected if the 20-mer peptides indeed formed novel recognition structures. All of the peptides were charged, suggesting that some of their interactions with the Cdk2 target could be ionic.

[0108] To confirm the specificity of the Cdk2 interaction, a Gst-Cdk2 fusion protein was immobilized on glutathione sepharose beads, and these beads were used to specifically precipitate bacterially expressed peptide aptamers. One set of results is shown in FIG. 5, and another set in FIG. 6.

[0109] For the FIG. 5 results, Gst-Cdk2 was expressed in E. coli and purified on glutathione sepharose as previously described (Lee et al., Nature 374:91-94 (1995)). The peptides were generated as follows: fragments that directed the synthesis of peptides 3 and 13 were made by PCR amplification of the insert encoded by the corresponding library plasmid and introduced into pAL-TrxA (LaVallie et al., supra). Fusion proteins were expressed and lysed in a French pressure cell as previously described (LaVallie et al., supra). Coprecipitation was carried out using Gst-Sepharose beads as described in Lee et al. (supra), and samples were run on 15% SDS polyacrylamide gels and transferred to nylon membranes. TrxA-containing fusion proteins were visualized by probing the membranes with an anti-TrxA antibody, followed by treatment of the immobilized antibody with peroxidase-coupled anti-rabbit IgG antibody and ECL reagents according to the manufacturer's instructions (Amersham, Arlington Heights, Ill.).

[0110] For the FIG. 6 results, Gst and Gst-Cdk2 were purified as described (Lee et al., Nature 374:91-94 (1995)). pALHISTRX was constructed by annealing the oligonucleotides

[0111] 5′ TAATGAGCGATAAACACCACCACCACCACCACGACGACGACGACAAAGG 3′ (SEQ ID NO: 27) and

[0112] 5′ TACCTTTGTCGCTGTCGTCGTGGTGGTGGTGGTGGTGTTTATCGCTCATTA 3′ (SEQ ID NO: 28), and ligating into NdeI-cut pALTRX-781 (LaVallie et al., supra). AvaII fragments encoding peptide loops were then cloned from the library plasmids into RsrII-cut pALHISTRX. His6-TrxA and His6-aptamers were expressed in GI724 as previously described (Ausubel et al., supra), the proteins were purified on Ni²⁺-NTA-Agarose according to manufacturer's directions (Qiagen, Chatsworth, Calif.), and then dialyzed against 10 mM Hepes pH 7.4/50 mM NaCl. 1 μg of His6-TrxA or His6-aptamers was precipitated with Gst or Gst-Cdk2 sepharose beads as described (Lee et al., supra), and the products detected by Western blot analysis with an anti-TrxA rabbit antiserum and ECL reagents (Amersham, Arlington Heights, Ill.).
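As an illustration of what the first oligonucleotide (SEQ ID NO: 27) may encode, the following Python sketch translates it beginning at its first ATG. Reading the oligonucleotide in this frame is an assumption made for illustration, and the codon table is deliberately partial: it contains only the standard-genetic-code assignments for codons that occur in this oligonucleotide.

```python
# Partial codon table (standard genetic code), restricted to the
# codons present in SEQ ID NO: 27.
CODON_TABLE = {
    "ATG": "M", "AGC": "S", "GAT": "D",
    "AAA": "K", "CAC": "H", "GAC": "D",
}

# SEQ ID NO: 27 as given in the text (49 nucleotides).
oligo = "TAATGAGCGATAAACACCACCACCACCACCACGACGACGACGACAAAGG"

# Assumed reading frame: start at the first ATG in the oligonucleotide.
start = oligo.find("ATG")

# Split into complete codons; trailing bases that do not fill a codon are ignored.
codons = [oligo[i:i + 3] for i in range(start, len(oligo) - 2, 3)]
peptide = "".join(CODON_TABLE[codon] for codon in codons)

print(peptide)
```

In this frame the translation contains six consecutive histidine residues, consistent with the His6 designation used for the resulting fusion proteins.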

[0113] The results shown in FIGS. 5 and 6 demonstrated that the interactions between Cdk2 and the peptide aptamers could be observed in vitro, and were thus independent of any bridge proteins native to yeast.

[0114] To determine the binding affinities of these aptamers for Cdk2, the following experiments were carried out. Based on interpolation from interaction trap calibration experiments (Estojak et al., Mol. Cell. Biol. 15:5820-5829 (1995)), the robust transcription that some of the aptamers of FIGS. 4A and 4B directed from the pSH18-34 reporter suggested that the equilibrium dissociation constants (Kds) of the interactions were <10⁻⁶ M. In order to precisely measure the binding affinity of the aptamers to Cdk2, we used an evanescent wave instrument (BIAcore, Pharmacia, Piscataway, N.J.). Purified His6-Cdk2 was coupled to CM-dextran chips, and peptide aptamers flowed in running buffer over the chips. Following binding, the chips were rinsed with running buffer lacking aptamer.

[0115] In particular, in these experiments, His6-Cdk2 was cross-linked in 10 mM MES pH 6.1/50 mM NaCl to CM5 chips with an amine-coupling kit (Pharmacia, Piscataway, N.J.). Purified aptamers were then flowed in running buffer (Hepes 10 mM pH 7.4/50 mM NaCl) onto the chips at 5 μl/minute, and association and dissociation of the His6-Cdk2-aptamer complexes recorded as variations in resonance angle with time. Association phase started upon aptamer injection, and dissociation phase upon running buffer injection. Portions of association and dissociation curves were then fitted that excluded the sudden variations in resonance angle caused by transitions between running buffer and aptamer-containing running buffer, which differed slightly in refractive index (“buffer fluxes”).
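The idea of extracting a dissociation rate constant from the dissociation phase can be sketched as follows. This Python example assumes a single-exponential decay, R(t) = R₀·exp(−k_off·t), and uses a straight-line least-squares fit of ln R against t. That is a simplified substitute for the non-linear least-squares fitting used in these experiments, and the trace is synthetic and noise-free rather than measured data. The function name and all numerical values are hypothetical.

```python
import math

def fit_dissociation_rate(times, responses):
    """Estimate k_off from R(t) = R0 * exp(-k_off * t).

    Fits a straight line to ln(R) versus t by ordinary least squares;
    the slope of that line is -k_off.
    """
    logs = [math.log(r) for r in responses]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return -sxy / sxx

# Synthetic trace (hypothetical values, not measured data).
true_k_off = 480e-6                       # s^-1
times = [10.0 * i for i in range(1, 61)]  # 10 s to 600 s; t = 0 omitted,
                                          # mimicking exclusion of the buffer transition
responses = [100.0 * math.exp(-true_k_off * t) for t in times]

k_off_estimate = fit_dissociation_rate(times, responses)
print(f"estimated k_off = {k_off_estimate:.3e} s^-1")
```

With real sensorgrams, noise and baseline drift make the log-linear shortcut less reliable than a direct non-linear fit, which is one reason to prefer the latter in practice.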

[0116] Association and dissociation rate constants were determined by fitting the association and dissociation phases of at least two runs (and typically four runs) for each aptamer to exponential functions using the data analysis program IGOR (Wavemetrics, Inc., Lake Oswego, Oreg.) and a non-linear least squares algorithm as described in O'Shannessy et al. (Anal. Biochem. 212:457-468 (1993)). Kds were calculated by dividing dissociation rate constants by association rate constants. A representative wave instrument run is shown in FIG. 7, and Table 1 indicates that, under the conditions described above, all aptamers exhibited Kds between 30 and 120 nM.

TABLE 1

Aptamer    Dissociation rate constant    Association rate constant    Kd (nM)
           (×10⁻⁶ s⁻¹)                   (M⁻¹ s⁻¹)
Pep 2      480 ± 109                     7474 ± 270                   64 ± 12
Pep 3      246 ± 20                      2201 ± 160                   112 ± 1
Pep 5      428 ± 16                      8263 ± 215                   52 ± 1
Pep 8      120 ± 15                      3122 ± 23                    38 ± 5
Pep 10     693 ± 64                      6555 ± 28                    105 ± 10
Pep 11     484 ± 25                      5590 ± 168                   87 ± 7
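The relationship Kd = k_off / k_on can be checked against the mean rate constants of Table 1 with a short Python sketch. The variable and function names are illustrative only, and the calculation ignores the reported uncertainties.

```python
# Mean rate constants transcribed from Table 1:
# (dissociation rate constant in s^-1, association rate constant in M^-1 s^-1)
rate_constants = {
    "Pep 2": (480e-6, 7474),
    "Pep 3": (246e-6, 2201),
    "Pep 5": (428e-6, 8263),
    "Pep 8": (120e-6, 3122),
    "Pep 10": (693e-6, 6555),
    "Pep 11": (484e-6, 5590),
}

def dissociation_constant_nM(k_off, k_on):
    """Kd = k_off / k_on, converted from M to nM."""
    return k_off / k_on * 1e9

kd_nM = {
    name: dissociation_constant_nM(k_off, k_on)
    for name, (k_off, k_on) in rate_constants.items()
}

for name, kd in kd_nM.items():
    print(f"{name}: Kd = {kd:.1f} nM")
```

The values computed from the mean rate constants agree with the tabulated Kds to within about 1 nM. A small difference, such as for Pep 10, would be expected if the tabulated Kds were averaged over individual runs rather than computed from mean rate constants, although the text does not say which was done.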

[0117] The ability to select TrxA-peptides that interact specifically with designated intracellular baits allows for the creation of other classes of intracellular reagents. For example, appropriately derivatized TrxA-peptide fusions may allow the creation of antagonists or agonists (as described above). Alternatively, peptide fusions allow for the creation of homodimeric or heterodimeric “matchmakers,” which force the interaction of particular protein pairs. In one particular example, two proteins are forced together by utilizing a leucine zipper sequence attached to a conformation-constraining protein containing a candidate interaction peptide. This protein can bind to both members of a protein pair of interest and direct their interaction. Alternatively, the “matchmaker” may include two different sequences, one having affinity for a first polypeptide and the second having affinity for the second polypeptide; again, the result is directed interaction between the first and second polypeptides. Another practical application for the peptide fusions described herein is the creation of “destroyers,” which target a bound protein for destruction by host proteases. In an example of the destroyer application, a protease is fused to one component of an interacting pair and that component is allowed to interact with the target to be destroyed (e.g., a protease substrate). By this method, the protease is delivered to its desired site of action and its proteolytic potential effectively enhanced. Yet another application of the fusion proteins described herein is as “conformational stabilizers,” which induce target proteins to favor a particular conformation or stabilize that conformation. In one particular example, the ras protein has one conformation that signals a cell to divide and another conformation that signals a cell not to divide. By selecting a peptide or protein that stabilizes the desired conformation, one can influence whether a cell will divide. Other proteins that undergo conformational changes which increase or decrease activity can also be bound to an appropriate “conformational stabilizer” to influence the property of the desired protein.

[0118] 2. Functional Inhibition of Cdk2

[0119] To determine whether Cdk2 interacting peptides might inhibit Cdk2 function in vivo, we took advantage of the fact that human Cdk2 can complement temperature sensitive alleles of Cdc28 (Elledge and Spottswood, EMBO 10:2653-2659, 1991; Ninomiya et al., PNAS 88:9006-9010, 1991; Meyerson et al., EMBO 11:2909-2917, 1992). Peptide 13 inhibits the plating efficiency of a Cdk2-dependent yeast. A strain carrying the temperature sensitive cdc28-1N mutation can form colonies at high temperature if it carries a plasmid that expresses Cdk2. At the restrictive temperature, compared to the plating efficiency of yeast expressing control peptides, expression of peptide 13 diminishes the plating efficiency of this strain by 10-fold. Both peptide 3 and 13 have similar effects on the plating efficiency at 37° C. of a Cdk2(+) strain that carries the cdc28-13ts allele.

[0120] Expression of peptide 13 slows the doubling time of a Cdk2(+), cdc28ts-1N strain by a factor of 50%. Microscopic examination of strains expressing the peptide revealed that a high proportion of these cells had an elongated morphology characteristic of cdc28-1N cells at the restrictive temperature, whereas cells expressing a control peptide had a more normal morphology.

[0121] Peptide 13 does not affect the growth of a cdc28-1Nts strain at high temperature when the defect is complemented by a plasmid expressing wild-type Cdc28 product, and has no effect on yeast at the permissive temperature. While we do not intend to be bound by any particular theory, it appears that this peptide blocks yeast cell cycle progression by binding to some face of the Cdk2 molecule and inhibiting its function and thereby interfering with its ability to interact with cyclins, other partners, or with substrates.

[0122] In later experiments with the aptamers of FIG. 4B, inhibition of Cdk2 activity by these peptides (for example, by binding to a face of the molecule and by blocking its interaction with one of its partner proteins or substrates) was examined. In particular, the ability of the aptamers to inhibit phosphorylation of Histone H1 by Cdk2/Cyclin E kinase was tested. To carry out these experiments, 2×10⁷ Sf9 cells were co-infected with recombinant baculoviruses expressing hemagglutinin-tagged Cdk2 and His6-Cyclin E as described (Kato et al., Genes & Dev. 7:331-342 (1993); Desai et al., Mol. Biol. Cell 3:571-582 (1992)). Cells were lysed 40 hours after infection in 500 μl of 1× Kinase Buffer (Kato et al., supra), and 5 μl of a 100-fold diluted extract was used in 30 μl reactions. Reactions were carried out for 20 minutes at 25° C. by adding 2.5 μCi of [γ³²P] ATP (3000 Ci/mmol), 25 μM ATP, 100 ng of Histone H1 (Sigma, St. Louis, Mo.), and varying amounts of His6-TrxA or His6-aptamers. Samples were run on 15% SDS-PAGE gels and exposed by autoradiography.

[0123] The results of these experiments are shown in FIG. 8. All tested aptamers were able to inhibit phosphorylation of Histone H1 by Cdk2/Cyclin E kinase. Under standard conditions (pH 7.5, 0 mM NaCl) (Kato et al., supra), apparent half-inhibitory concentrations ranged from 1.5 to 100 nM. To rule out the possibility that a trace bacterial contaminant was responsible for the inhibition, we removed the His6-peptide aptamer from the Pep2 preparation with a rabbit polyclonal anti-thioredoxin antiserum; this immunodepleted preparation no longer inhibited Cdk2 kinase activity. Half-inhibitory concentrations of aptamers were lower than the Kds measured from evanescent wave experiments, consistent with the idea that some of the energy of each interaction is ionic and is reduced by the salt in the evanescent wave instrument running buffer.

[0124] In co-precipitation experiments (Reymond et al., Oncogene 11:1173-1178 (1995)), purified Pep2 did not compete with in vitro-translated Cyclin E for binding to in vitro-translated Cdk2. However, inhibition by Pep2 was reversed by addition of a 10-fold excess of Histone H1, suggesting that at least Pep2 inhibits kinase activity by competing with its H1 substrate.

[0125] Previous studies have established that libraries of unconstrained peptides contain sequences capable of recognizing targets in vitro (Devlin et al., Science 249:404-406 (1990); Cwirla et al., Proc. Natl. Acad. Sci. USA 87:6378-6382 (1990); Lam et al., Nature 354:82-84 (1991); Songyang et al., Current Biology 4:973-982 (1994); Scott et al., Current Biology 5:40-48 (1994)) and in yeast (Yang et al., Nucl. Acids. Res. 23:1152-1156 (1995)); such isolated peptide sequences often bear similarity to natural interactors. By contrast, although constrained peptide libraries are less conformationally diverse (McConnell et al., Gene 151:115-118 (1994)), the lack of conformational diversity should lower the entropic cost if binding causes the loop to adopt a single conformation (Spolar et al., Science 263:777-784 (1994)); this reduction in entropic cost may account for the fact that our Cdk2 peptide aptamers recognize their targets with higher affinity than is typically observed for unconstrained peptides (Yang et al., supra; Oldenburg et al., Proc. Natl. Acad. Sci. USA 89:5393-5397 (1992); McLafferty et al., Gene 128:29-36 (1993)). This high affinity suggests that peptide aptamers may inhibit protein function in vivo, in the simplest case by binding to specific faces of the target molecule and disrupting its interaction with specific partners or effectors.

[0126] The ability to generate large numbers of aptamers from combinatorial libraries, taken together with the interaction trap, which offers a powerful selection for those that bind specific proteins, facilitates the selection of peptide aptamers against a variety of intracellular targets. Aptamers which inhibit protein contacts can be used to aid the dissection of the networks of protein interactions that govern division of higher eukaryotic cells and can also be used for the genetic analysis of those metazoan organisms for which isolation of specific missense alleles may be impractical. The analogy of the aptamers of the invention with antibodies indicates that peptide aptamers can also be used in other applications in which immunological reagents are now employed, such as ELISAs, immunofluorescence experiments, and sensors. If desired, the affinity of these aptamers may be increased, for example, by increasing their valency and using existing interaction technology to select mutants that bind more tightly. This first generation of peptide aptamers facilitates the production of recognition modules for intracellular nanotechnologies aimed at destroying, modifying, and assembling macromolecules inside cells.

[0127] 3. Thioredoxin Interaction Trap with OncoRas Bait

[0128] The ras proteins are essential for many signal transduction pathways and regulate numerous physiological functions including cell proliferation. The ras genes were first identified from the genome of Harvey and Kirsten sarcoma virus. The three types of mammalian ras genes (N-ras, K-ras, and H-ras) encode highly conserved membrane-bound guanine nucleotide binding proteins with a molecular mass of 21 kDa, which cycle between the active (GTP-bound) form and the inactive (GDP-bound) form.

[0129] In normal cells, the active form of Ras is short-lived, as its intrinsic GTPase activity rapidly converts the bound GTP to GDP. The GTPase activity is stimulated 10⁵-fold by GTPase-activating proteins (GAPs). GTP-bound Ras interacts with GAP, c-Raf, neurofibromatosis type 1 (NF-1) and Ral guanine nucleotide dissociation stimulator (RalGDS).

[0130] Mutationally-activated RAS proteins are found in about 30% of human tumor cells and have greatly decreased GTPase activity which cannot be stimulated by GAPs. The majority of mutations studied thus far are due to a point mutation at either residue Gly-12 or residue Gln-61 of Ras. These Ras mutants remain in the active form and interact with the downstream effectors to result in tumorigenesis. It has been shown that there are significant conformational differences between GTP-bound forms of wild-type and oncogenic RAS proteins. Such conformational differences are likely causes for malignant transformation induced by oncogenic ras proteins.

[0131] Such mutationally-activated conformational changes in GTP-bound H-ras mutants provide targets for members of a conformationally constrained random peptide library. In the present example, the library is a conformationally constrained thioredoxin peptide library, as described above. Library members that interact with oncogenic Ras have been identified using a variation of the interaction trap technology provided above. The oncogenic Ras peptide aptamers isolated may be assayed for their ability to disrupt the interaction of oncogenic Ras with known effectors and to inhibit cellular transformation.

[0132] We have used well-characterized oncogenic H-ras(G12V) for isolation and characterization of its peptide aptamers. Peptide aptamers for other oncogenes can be isolated using adaptations of this protocol as provided herein.

Bait Construction

[0133] Construction of LexA-Ras(G12V)/pEG202: H-Ras(G12V) DNA was obtained by digesting BTM116-H-Ras(G12V) (FIG. 9) with BamHI and SalI. The H-Ras(G12V) DNA was ligated with the pEG202 backbone digested with BamHI and SalI. The resulting plasmid was called pEG202-H-Ras(G12V) (or V6) (FIG. 10).

Screening for H-Ras(G12V) Peptide Aptamers

pEG202-H-Ras(G12V) (V6) was transformed into the EGY48 strain according to a standard yeast transformation protocol; in particular, the protocol provided by Zymo Research (Orange County, Calif.) was used here. EGY48 was grown in YPD medium to OD₆₀₀=0.2-0.7. Cells were pelleted at 500 × g for 4 min. and resuspended in 10 ml of EZ1 solution (Zymo Research). The cells were then pelleted by centrifugation and resuspended in 1 ml of EZ2 (Zymo Research). Aliquots of competent cells (50 μl) were stored in a −70° C. freezer.

[0134] An aliquot of competent cells was mixed with 0.1 μg of LexA-H-Ras(G12V)/pEG202 and 500 μl of EZ3 solution (Zymo Research). The mixture was incubated at 30° C. for 30 min. and plated onto a yeast medium lacking histidine and uracil. One colony was picked and inoculated into 100 ml of glucose Ura-His- medium at 30° C. with shaking (150 rpm) until the OD₆₀₀ measurement was 0.96. The culture was centrifuged at 2000 × g for 5 min and cell pellets were resuspended in 5 ml of sterile LiOAc/TE. The cells were again centrifuged as above and resuspended in 0.5 ml of sterile LiOAc/TE.

[0135] Aliquots (50 μl) of the cells were then incubated at 30° C. for 30 min. with 1 μg of thioredoxin peptide library DNA, 70 μg of salmon sperm DNA, and 300 μl of sterile 40% PEG 4000 in LiOAc/TE. The mixtures were heat-shocked at 42° C. for 15 min. Each aliquot was plated onto a 24 cm×24 cm plate containing glucose Ura-His-Trp- medium and was incubated at 30° C. for two days. The transforming efficiency typically ranged from 50,000 to 100,000 colony forming units per μg of library DNA.

[0136] A total of 1.5 million transformants were obtained and were plated onto the selection medium of galactose/raffinose Leu-Ura-His-Trp-. Of the 338 colonies formed, 50 were randomly picked and inoculated into 5 ml of glucose Leu-Ura-His-Trp- medium for preparation of yeast plasmid DNA. A half ml of each yeast culture was mixed with an equal volume of acid-washed sand and phenol/chloroform/isoamyl alcohol (24:24:1), and vortexed for 2 min. The mixture was then centrifuged for 15 min., and the supernatant was precipitated with ethanol. DNA pellets were resuspended in 50 μl of TE.

[0137] One μl of each sample was used to transform E. coli KC8 cells by electroporation. Bacterial transformants were selected on minimal agar supplemented with uracil, leucine, histidine, and ampicillin. Each type of transformant resulted in the final isolation of a plasmid which has a leucine marker and which carries a DNA fragment encoding a thioredoxin-peptide fusion protein.

[0138] Sequence determination of the 50 isolates was carried out according to the directions of the fmol DNA Sequencing System (Promega, Madison, Wisc.) using primer 5′-GACGGGGCGATCCTCGTCG-3′ (SEQ ID NO: 16). Nine out of 50 isolates (referred to as #4, #18, #39, #41, #22, #24, #30, #31, #46) contained unique peptide encoding sequences, as determined by electrophoresis of the dT/ddT termination reaction. Among them, the predicted peptide aptamer sequence of #39 is as follows: Trp-Ala-Glu-Trp-Cys-Gly-Pro-Val-Cys-Ala-His-Gly-Ser-Arg-Ser-Leu-Thr-Leu-Leu-Thr-Lys-Tyr-His-Val-Ser-Phe-Leu-Gly-Pro-Cys-Lys-Met-Ile-Ala-Pro-Ile-Leu-Asp (SEQ ID NO: 17). From our results, it appears that approximately 60 unique H-Ras(G12V) peptide aptamers (338×9/50) were isolated in the first round of screening.
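The extrapolation (338×9/50) scales the fraction of unique sequences found in the sequenced sample to all colonies. A minimal Python sketch of that arithmetic follows. It assumes, as the extrapolation itself does, that the 50 sequenced isolates are representative of the 338 colonies. The variable names are illustrative.

```python
colonies_total = 338     # colonies formed on selection medium
isolates_sequenced = 50  # randomly picked and sequenced
isolates_unique = 9      # isolates with unique peptide-encoding sequences

# Scale the unique fraction of the sample up to all colonies.
estimated_unique = colonies_total * isolates_unique / isolates_sequenced

print(f"estimated unique aptamers: {estimated_unique:.1f}")
```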

Other Embodiments

[0139] As described above, the invention features a method for detecting and analyzing protein-protein interactions. Typically, in the above experiments, the bait protein is fused to the DNA binding domain, and the prey protein (in association with the conformation-constraining protein) is fused to the gene activation domain. The invention, however, is readily adapted to other formats. For example, the invention also includes a “reverse” interaction trap in which the bait protein is fused to a gene activation domain, and the prey protein (in association with a conformation-constraining protein) is fused to the DNA binding domain. Again, an interaction between the bait and prey proteins results in activation of reporter gene expression. Such a “reverse” interaction trap system, however, depends upon the use of prey proteins which do not themselves activate downstream gene expression.

[0140] The protein interaction assays described herein can also be accomplished in a cell-free, in vitro system. Such a system may begin with a DNA construct including a reporter gene operably linked to a DNA-binding-protein recognition site (e.g., a LexA binding site). To this DNA is added a bait protein (e.g., any of the bait proteins described herein bound to a LexA DNA binding domain) and a prey protein (e.g., one of a library of conformationally-constrained candidate interactor prey proteins bound to a gene activation domain). Interaction between the bait and prey protein is assayed by measuring the reporter gene product, either as an RNA product, as an in vitro translated protein product, or by some enzymatic activity of the translated reporter gene product. Alternatively, interactions involving conformationally constrained proteins may be carried out by direct in vitro techniques, for example, by any standard physical or biochemical technique for identifying protein interactions (such as immobilization of a first protein on a column or other solid support and contact with a conformationally-constrained protein). These direct in vitro approaches are preferably carried out in such a way that the DNA encoding the conformationally-constrained protein may be readily isolated, for example, by using techniques involving phage display or display of the protein on the E. coli flagella.

[0141] These in vitro systems may also be used to identify agonists or antagonists, simply by adding to a known pair of interacting proteins (in the above described system) a candidate agonist or antagonist interactor and assaying for an increase or decrease (respectively) in reporter gene expression, as compared to a control reaction lacking the candidate compound or protein. To facilitate large scale screening, candidate prey proteins or candidate agonists or antagonists may be initially tested in pools, for example, of ten or twenty candidate compounds or proteins. From pools demonstrating a positive result, the particular interacting protein or agonist or antagonist is then identified by individually assaying the components of the pool. Such in vitro systems are amenable to robotic automation or to the production of kits. Kits including the components of any of the interaction trap systems described herein are also included in the invention.
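The two-stage pooled screen described above can be sketched in Python as follows. This is a hypothetical illustration only: the function name, the candidate labels, and the set of true hits are invented for the example, and a pool is assumed to score positive whenever it contains at least one true hit. Real assays may have false positives or negatives that this sketch ignores.

```python
def screen_in_pools(candidates, true_hits, pool_size=10):
    """Simulate a two-stage pooled screen.

    Stage 1 assays each pool once; stage 2 assays every member of each
    positive pool individually. Returns (hits found, assays performed).
    """
    found, assays = [], 0
    for start in range(0, len(candidates), pool_size):
        pool = candidates[start:start + pool_size]
        assays += 1  # one assay on the whole pool
        # Assumed: a pool is positive if it holds at least one true hit.
        if any(candidate in true_hits for candidate in pool):
            for candidate in pool:
                assays += 1  # one individual assay per pool member
                if candidate in true_hits:
                    found.append(candidate)
    return found, assays

# Hypothetical example: 100 candidates, two of which are true interactors.
candidates = [f"candidate_{i}" for i in range(100)]
true_hits = {"candidate_7", "candidate_42"}

found, assays = screen_in_pools(candidates, true_hits, pool_size=10)
print(found, assays)
```

In this example the two hits fall in different pools, so the screen uses 10 pooled assays plus 20 individual assays, compared with 100 assays if every candidate were tested individually.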

[0142] In one particular embodiment, interacting proteins identified in vitro are tested for their ability to interact in vivo. Such in vivo interacting proteins may be used for any diagnostic or therapeutic purpose. For example, proteins shown to interact in vivo may be used to disrupt, encourage, or stabilize intracellular interactions or may be used as an intracellular antibody-type reagent.

[0143] The components (e.g., the various fusion proteins or DNA therefor) of any of the in vivo or in vitro systems of the invention may be provided sequentially or simultaneously depending on the desired experimental design.

[0144] Other embodiments are within the following claims.

Number of sequences: 28

SEQ ID NO: 1 (length 20; amino acid; strandedness not relevant; linear)
Leu Val Cys Lys Ser Tyr Arg Leu Asp Trp Glu Ala Gly Ala Leu Phe Arg Ser Leu Phe

SEQ ID NO: 2 (length 20; amino acid; strandedness not relevant; linear)
Met Val Val Ala Ala Glu Ala Val Arg Thr Val Leu Leu Ala Asp Gly Gly Asp Val Thr

SEQ ID NO: 3 (length 20; amino acid; strandedness not relevant; linear)
Pro Asn Trp Pro His Gln Leu Arg Val Gly Arg Val Leu Trp Glu Arg Leu Ser Phe Glu

SEQ ID NO: 4 (length 91; nucleic acid; double-stranded; linear; N is A or T or G or C; K is G or T)
GACTGACTGG TCCGNNKNNK NNKNNKNNKN NKNNKNNKNN KNNKNNKNNK NNKNNKNNKN NKNNKNNKNN KNNKGGTCCT CAGTCAGTCA G

SEQ ID NO: 5 (length 17; nucleic acid; double-stranded; linear)
CTGACTGACT GAGGACC

SEQ ID NO: 6 (length 24; amino acid; strandedness not relevant; linear)
Gly Pro Leu Val Cys Lys Ser Tyr Arg Leu Asp Trp Glu Ala Gly Ala Leu Phe Arg Ser Leu Phe Gly Pro

SEQ ID NO: 7 (length 24; amino acid; strandedness not relevant; linear)
Gly Pro Met Val Val Ala Ala Glu Ala Val Arg Thr Val Leu Leu Ala Asp Gly Gly Asp Val Thr Gly Pro

SEQ ID NO: 8 (length 24; amino acid; strandedness not relevant; linear)
Gly Pro Pro Asn Trp Pro His Gln Leu Arg Val Gly Arg Val Leu Trp Glu Arg Leu Ser Phe Glu Gly Pro

SEQ ID NO: 9 (length 20; amino acid; strandedness not relevant; linear)
Ser Val Arg Met Arg Tyr Gly Ile Asp Ala Phe Phe Asp Leu Gly Gly Leu Leu His Gly

SEQ ID NO: 10 (length 42; amino acid; strandedness not relevant; linear)
Glu Leu Arg His Arg Leu Gly Arg Ala Leu Ser Glu Asp Met Val Arg Gly Leu Ala Trp Gly Pro Thr Ser His Cys Ala Thr Val Pro Gly Thr Ser Asp Leu Trp Arg Val Ile Arg Phe Leu

SEQ ID NO: 11 (length 20; amino acid; strandedness not relevant; linear)
Tyr Ser Phe Val His His Gly Phe Phe Asn Phe Arg Val Ser Trp Arg Glu Met Leu Ala

SEQ ID NO: 12 (length 20; amino acid; strandedness not relevant; linear)
Gln Val Trp Ser Leu Trp Ala Leu Gly Trp Arg Trp Leu Arg Arg Tyr Gly Trp Asn Met

SEQ ID NO: 13 (length 20; amino acid; strandedness not relevant; linear)
Trp Arg Arg Met Glu Leu Asp Ala Glu Ile Arg Trp Val Lys Pro Ile Ser Pro Leu Glu

SEQ ID NO: 14 (length 31; amino acid; strandedness not relevant; linear)
Arg Arg Ala Ser Val Cys Gly Pro Leu Leu Ser Lys Arg Gly Tyr Gly Pro Pro Phe Tyr Leu Ala Gly Met Thr Ala Pro Glu Gly Pro Cys

SEQ ID NO: 15 (length 30; amino acid; strandedness not relevant; linear)
Arg Arg Ala Ser Val Cys Gly Pro Leu His Tyr Trp Gly Leu Gly Gly Phe Val Asp Leu Trp Gln Glu Thr Thr Gly Val Gly Pro Cys

SEQ ID NO: 16 (length 19; nucleic acid; double-stranded; linear)
GACGGGGCGA TCCTCGTCG

SEQ ID NO: 17 (length 38; amino acid; strandedness not relevant; linear)
Trp Ala Glu Trp Cys Gly Pro Val Cys Ala His Gly Ser Arg Ser Leu Thr Leu Leu Thr Lys Tyr His Val Ser Phe Leu Gly Pro Cys Lys Met Ile Ala Pro Ile Leu Asp

SEQ ID NO: 18 (length 20; amino acid; strandedness not relevant; linear)
Leu Val Cys Lys Ser Tyr Arg Leu Asp Trp Glu Ala Gly Ala Leu Phe Arg Ser Leu Phe

SEQ ID NO: 19 (length 20; amino acid; strandedness not relevant; linear)
Tyr Arg Trp Gln Gln Gly Val Val Pro Ser Asn Trp Ala Ser Cys Ser Phe Arg Cys Gly

SEQ ID NO: 20 (length 38; amino acid; strandedness not relevant; linear)
Ser Ser Phe Ser Leu Trp Leu Leu Met Val Lys Ser Ile Lys Arg Ala Ala Trp Glu Leu Gly Pro Ser Ser Ala Trp Asn Thr Ser Gly Trp Ala Ser Leu Ala Asp Phe Tyr

SEQ ID NO: 21 (length 20 amino acids; amino acid; strandedness not relevant; linear; protein)
Arg Val Lys Leu Gly Tyr Ser Phe Trp Ala Gln Ser Leu Leu Arg Cys Ile Ser Val Gly

SEQ ID NO: 22 (length 20 amino acids; amino acid; strandedness not relevant; linear; protein)
Gln Leu Tyr Ala Gly Cys Tyr Leu Gly Val Val Ile Ala Ser Ser Leu Ser Ile Arg Val

SEQ ID NO: 23 (length 31 amino acids; amino acid; strandedness not relevant; linear; protein)
Gln Gln Arg Phe Val Phe Ser Pro Ser Trp Phe Thr Cys Ala Gly Thr Ser Asp Phe Trp Gly Pro Glu Pro Leu Phe Asp Trp Thr Arg Asp

SEQ ID NO: 24 (length 20 amino acids; amino acid; strandedness not relevant; linear; protein)
Arg Pro Leu Thr Gly Arg Trp Val Val Trp Gly Arg Arg His Glu Glu Cys Gly Leu Thr

SEQ ID NO: 25 (length 20 amino acids; amino acid; strandedness not relevant; linear; protein)
Pro Val Cys Cys Met Met Tyr Gly His Arg Thr Ala Pro His Ser Val Phe Asn Val Asp

SEQ ID NO: 26 (length 20 amino acids; amino acid; strandedness not relevant; linear; protein)
Trp Ser Pro Glu Leu Leu Arg Ala Met Val Ala Phe Arg Trp Leu Leu Glu Arg Arg Pro

SEQ ID NO: 27 (length 49 base pairs; nucleic acid; single-stranded; linear; DNA)
TAATGAGCGA TAAACACCAC CACCACCACC ACGACGACGA CGACAAAGG

SEQ ID NO: 28 (length 51 base pairs; nucleic acid; single-stranded; linear; DNA)
TACCTTTGTC GCTGTCGTCG TGGTGGTGGT GGTGGTGTTT ATCGCTCATT A
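As an illustrative aside on the degenerate oligonucleotide of SEQ ID NO: 4, the listing defines N as A, T, G, or C and K as G or T, and the 91-nucleotide sequence appears to contain 20 consecutive NNK triplets between fixed flanking sequences. The Python sketch below expands the NNK triplet under the standard genetic code and checks those counts. The names `CODON_TABLE`, `IUPAC`, and `expand` are hypothetical helpers introduced here, and reading the degenerate region as in-frame codons is an assumption, not a statement taken from the listing.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}

# Degenerate symbols as defined in the listing for SEQ ID NO: 4.
IUPAC = {"N": "ACGT", "K": "GT"}

# SEQ ID NO: 4 written as fixed flanks plus 20 NNK triplets (assumed in frame).
SEQ_ID_NO_4 = "GACTGACTGGTCCG" + "NNK" * 20 + "GGTCCTCAGTCAGTCAG"


def expand(codon):
    """Return every concrete codon matched by a degenerate triplet."""
    choices = [IUPAC.get(base, base) for base in codon]
    return ["".join(p) for p in product(*choices)]


nnk_codons = expand("NNK")
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
amino_acids = {CODON_TABLE[c] for c in nnk_codons} - {"*"}

print(len(SEQ_ID_NO_4), "nucleotides in the oligonucleotide")
print(len(nnk_codons), "codons match NNK")
print(len(amino_acids), "distinct amino acids encoded; stop codons:", stop_codons)
```

Under these assumptions, each NNK position can encode any of the 20 amino acids while admitting only one of the three stop codons, which is a common reason for choosing NNK over NNN in random peptide libraries.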

1. A method of determining whether a first protein is capable of physically interacting with a second protein, comprising: (a) providing a host cell which contains (i) a reporter gene operably linked to a DNA-binding-protein recognition site; (ii) a first fusion gene which expresses a first fusion protein, said first fusion protein comprising said first protein covalently bonded to a binding moiety which is capable of specifically binding to said DNA-binding-protein recognition site; and (iii) a second fusion gene which expresses a second fusion protein, said second fusion protein comprising said second protein covalently bonded to a gene activating moiety and being conformationally-constrained; and (b) measuring expression of said reporter gene as a measure of an interaction between said first and said second proteins.
2. The method of claim 1, wherein said second protein is a peptide of at least 6 amino acids.
3. The method of claim 1, wherein said second protein is a peptide of less than or equal to 60 amino acids in length.
4. The method of claim 1, wherein said second protein comprises a randomly generated or intentionally designed peptide sequence.
5. The method of claim 1, wherein said second protein is conformationally-constrained because it is covalently bonded to a conformation-constraining protein.
6. The method of claim 1, wherein said second protein comprises one or more loops.
7. The method of claim 1, wherein said first protein is Cdk2.
8. The method of claim 1, wherein said first protein is Ras or an activated Ras.
9. The method of claim 5, wherein said second protein is embedded within said conformation-constraining protein.
10. The method of claim 5, wherein said conformation-constraining protein is thioredoxin.
11. The method of claim 5, wherein said conformation-constraining protein is a thioredoxin-like molecule.
12. The method of claim 10, wherein said second protein is inserted into the active site loop of said thioredoxin protein.
13. The method of claim 1, wherein said second protein is conformationally-constrained by disulfide bonds between cysteine residues in the amino-terminus and in the carboxy-terminus of said second protein.
14. The method of claim 1, wherein said host cell is yeast.
15. The method of claim 1, wherein said DNA binding domain is LexA.
16. The method of claim 1, wherein said reporter gene is assayed by a color reaction.
17. The method of claim 1, wherein said reporter gene is assayed by cell viability.
18. A method of detecting an interacting protein in a population of proteins, comprising: (a) providing a host cell which contains (i) a reporter gene operably linked to a DNA-binding-protein recognition site; and (ii) a fusion gene which expresses a fusion protein, said fusion protein comprising a test protein covalently bonded to a binding moiety which is capable of specifically binding to said DNA-binding-protein recognition site; (b) introducing into said host cell a second fusion gene which expresses a second fusion protein, said second fusion protein comprising one of said population of proteins covalently bonded to a gene activating moiety and being conformationally-constrained; and (c) measuring expression of said reporter gene.
19. The method of claim 18, wherein said population of proteins comprises short peptides of between 1 and 60 amino acids in length.
20. The method of claim 18, wherein said population of proteins is a set of randomly generated or intentionally designed peptide sequences.
21. The method of claim 18, wherein said population of proteins is conformationally-constrained by covalently bonding to a conformation-constraining protein.
22. The method of claim 18, wherein said population of proteins each comprises one or more loops.
23. The method of claim 21, wherein each of said population of proteins is embedded within a conformation-constraining protein.
24. The method of claim 21, wherein said conformation-constraining protein is thioredoxin.
25. The method of claim 24, wherein each of said population of proteins is inserted into the active site loop of said thioredoxin.
26. The method of claim 18, wherein each of said population of proteins is conformationally-constrained by disulfide bonds between cysteine residues in the amino-terminus and in the carboxy-terminus of said protein.
27. The method of claim 18, wherein said first protein is Cdk2.
28. The method of claim 18, wherein said first protein is Ras or an activated Ras.
29. The method of claim 18, wherein said host cell is yeast.
30. The method of claim 18, wherein said DNA binding domain is LexA.
31. The method of claim 18, wherein said reporter gene is assayed by a color reaction.
32. The method of claim 18, wherein said reporter gene is assayed by cell viability.
33. A method of identifying a candidate interactor, comprising: (a) providing a reporter gene operably linked to a DNA-binding-protein recognition site; (b) providing a first fusion protein, said first fusion protein comprising a first protein covalently bonded to a binding moiety which is capable of specifically binding to said DNA-binding-protein recognition site; (c) providing a second fusion protein, said second fusion protein comprising a second protein covalently bonded to a gene activating moiety and being conformationally-constrained, said second protein being capable of interacting with said first protein; (d) contacting said candidate interactor with said first protein and/or said second protein; and (e) measuring expression of said reporter gene.
34. The method of claim 33, wherein providing said first fusion protein comprises providing a first fusion gene which expresses said first fusion protein and wherein providing said second fusion protein comprises providing a second fusion gene which expresses said second fusion protein.
35. The method of claim 33, wherein said first fusion protein and said second fusion protein are permitted to interact prior to contact with said candidate interactor.
36. The method of claim 33, wherein said first fusion protein and said candidate interactor are permitted to interact prior to contact with said second fusion protein.
37. The method of claim 33, wherein said candidate interactor is conformationally-constrained.
38. The method of claim 33, wherein said candidate interactor comprises one or more loops.
39. The method of claim 33, wherein said candidate interactor is an antagonist and reduces reporter gene expression.
40. The method of claim 33, wherein said candidate interactor is an agonist and increases reporter gene expression.
41. The method of claim 33, wherein said candidate interactor is a member selected from the group consisting of proteins, polynucleotides, and small molecules.
42. The method of claim 33, wherein said candidate interactor is encoded by a member of a cDNA or synthetic DNA library.
43. The method of claim 33, wherein said candidate interactor is a mutated form of said first fusion protein or said second fusion protein.
44. The method of claim 34, wherein said reporter gene, said first fusion gene, and said second fusion gene are included on a single piece of DNA.
45. The method of claim 33, wherein said method further comprises (f) determining whether said second protein interacts with said first protein inside a cell.
46. The method of claim 33, wherein said first protein is Cdk2.
47. The method of claim 33, wherein said first protein is Ras or an activated Ras.
48. A population of eukaryotic cells, each cell having a recombinant DNA molecule encoding a conformationally-constrained intracellular peptide, there being at least 100 different recombinant molecules in said population, each molecule being in at least one cell of said population.
49. The population of eukaryotic cells of claim 48, wherein said intracellular peptide is conformationally-constrained because it is covalently bonded to a conformation-constraining protein.
50. The population of claim 49, wherein said intracellular peptide is embedded within said conformation-constraining protein.
51. The population of claim 49, wherein said intracellular peptide comprises one or more loops.
52. The population of eukaryotic cells of claim 49, wherein said conformation-constraining protein is thioredoxin.
53. The population of eukaryotic cells of claim 48, wherein said intracellular peptide is conformationally-constrained by disulfide bonds between cysteine residues in the amino-terminus and in the carboxy-terminus of said second protein.
54. The population of eukaryotic cells of claim 48, wherein said cells are yeast cells.
55. The population of eukaryotic cells of claim 48, wherein said recombinant DNA molecule further encodes a gene activating moiety covalently bonded to said intracellular peptide.
56. The population of eukaryotic cells of claim 48, wherein said intracellular peptide physically interacts with a second recombinant protein inside said eukaryotic cells.
57. A method of assaying an interaction between a first protein and a second protein, comprising: (a) providing a reporter gene operably linked to a DNA-binding-protein recognition site; (b) providing a first fusion protein comprising said first protein covalently bonded to a binding moiety which is capable of specifically binding to said DNA-binding-protein recognition site; (c) providing a second fusion protein comprising said second protein covalently bonded to a gene activating moiety and being conformationally-constrained; (d) combining said reporter gene, said first fusion protein, and said second fusion protein; and (e) measuring expression of said reporter gene.
58. The method of claim 57, wherein providing said first fusion protein comprises providing a first fusion gene which expresses said first fusion protein and wherein providing said second fusion protein comprises providing a second fusion gene which expresses said second fusion protein.
59. The method of claim 57, wherein said second fusion protein comprises one or more loops.
60. The method of claim 57, wherein said method further comprises (f) determining whether said second protein interacts with said first protein inside a cell.
61. A protein comprising the sequence Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-Leu-Phe-Arg-Ser-Leu-Phe (SEQ ID NO: 1).
62. The protein of claim 61, wherein said protein is conformationally-constrained.
63. A protein comprising the sequence Met-Val-Val-Ala-Ala-Glu-Ala-Val-Arg-Thr-Val-Leu-Leu-Ala-Asp-Gly-Gly-Asp-Val-Thr (SEQ ID NO: 2).
64. The protein of claim 63, wherein said protein is conformationally-constrained.
65. A protein comprising the sequence Pro-Asn-Trp-Pro-His-Gln-Leu-Arg-Val-Gly-Arg-Val-Leu-Trp-Glu-Arg-Leu-Ser-Phe-Glu (SEQ ID NO: 3).
66. The protein of claim 65, wherein said protein is conformationally-constrained.
67. A protein comprising the sequence Ser-Val-Arg-Met-Arg-Tyr-Gly-Ile-Asp-Ala-Phe-Phe-Asp-Leu-Gly-Gly-Leu-Leu-His-Gly (SEQ ID NO: 9).
68. The protein of claim 67, wherein said protein is conformationally-constrained.
69. A protein comprising the sequence Glu-Leu-Arg-His-Arg-Leu-Gly-Arg-Ala-Leu-Ser-Glu-Asp-Met-Val-Arg-Gly-Leu-Ala-Trp-Gly-Pro-Thr-Ser-His-Cys-Ala-Thr-Val-Pro-Gly-Thr-Ser-Asp-Leu-Trp-Arg-Val-Ile-Arg-Phe-Leu (SEQ ID NO: 10).
70. The protein of claim 69, wherein said protein is conformationally-constrained.
71. A protein comprising the sequence Tyr-Ser-Phe-Val-His-His-Gly-Phe-Phe-Asn-Phe-Arg-Val-Ser-Trp-Arg-Glu-Met-Leu-Ala (SEQ ID NO: 11).
72. The protein of claim 71, wherein said protein is conformationally-constrained.
73. A protein comprising the sequence Gln-Val-Trp-Ser-Leu-Trp-Ala-Leu-Gly-Trp-Arg-Trp-Leu-Arg-Arg-Tyr-Gly-Trp-Asn-Met (SEQ ID NO: 12).
74. The protein of claim 73, wherein said protein is conformationally-constrained.
75. A protein comprising the sequence Trp-Arg-Arg-Met-Glu-Leu-Asp-Ala-Glu-Ile-Arg-Trp-Val-Lys-Pro-Ile-Ser-Pro-Leu-Glu (SEQ ID NO: 13).
76. The protein of claim 75, wherein said protein is conformationally-constrained.
77. A protein comprising the sequence Trp-Ala-Glu-Trp-Cys-Gly-Pro-Val-Cys-Ala-His-Gly-Ser-Arg-Ser-Leu-Thr-Leu-Leu-Thr-Lys-Tyr-His-Val-Ser-Phe-Leu-Gly-Pro-Cys-Lys-Met-Ile-Ala-Pro-Ile-Leu-Asp (SEQ ID NO: 17).
78. The protein of claim 77, wherein said protein is conformationally-constrained.
79. A protein comprising the sequence Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-Leu-Phe-Arg-Ser-Leu-Phe (SEQ ID NO: 18).
80. The protein of claim 79, wherein said protein is conformationally-constrained.
81. A protein comprising the sequence Tyr-Arg-Trp-Gln-Gln-Gly-Val-Val-Pro-Ser-Asn-Trp-Ala-Ser-Cys-Ser-Phe-Arg-Cys-Gly (SEQ ID NO: 19).
82. The protein of claim 81, wherein said protein is conformationally-constrained.
83. A protein comprising the sequence Ser-Ser-Phe-Ser-Leu-Trp-Leu-Leu-Met-Val-Lys-Ser-Ile-Lys-Arg-Ala-Ala-Trp-Glu-Leu-Gly-Pro-Ser-Ser-Ala-Trp-Asn-Thr-Ser-Gly-Trp-Ala-Ser-Leu-Ala-Asp-Phe-Tyr (SEQ ID NO: 20).
84. The protein of claim 83, wherein said protein is conformationally-constrained.
85. A protein comprising the sequence Arg-Val-Lys-Leu-Gly-Tyr-Ser-Phe-Trp-Ala-Gln-Ser-Leu-Leu-Arg-Cys-Ile-Ser-Val-Gly (SEQ ID NO: 21).
86. The protein of claim 85, wherein said protein is conformationally-constrained.
87. A protein comprising the sequence Gln-Leu-Tyr-Ala-Gly-Cys-Tyr-Leu-Gly-Val-Val-Ile-Ala-Ser-Ser-Leu-Ser-Ile-Arg-Val (SEQ ID NO: 22).
88. The protein of claim 87, wherein said protein is conformationally-constrained.
89. A protein comprising the sequence Gln-Gln-Arg-Phe-Val-Phe-Ser-Pro-Ser-Trp-Phe-Thr-Cys-Ala-Gly-Thr-Ser-Asp-Phe-Trp-Gly-Pro-Glu-Pro-Leu-Phe-Asp-Trp-Thr-Arg-Asp (SEQ ID NO: 23).
90. The protein of claim 89, wherein said protein is conformationally-constrained.
91. A protein comprising the sequence Arg-Pro-Leu-Thr-Gly-Arg-Trp-Val-Val-Trp-Gly-Arg-Arg-His-Glu-Glu-Cys-Gly-Leu-Thr (SEQ ID NO: 24).
92. The protein of claim 91, wherein said protein is conformationally-constrained.
93. A protein comprising the sequence Pro-Val-Cys-Cys-Met-Met-Tyr-Gly-His-Arg-Thr-Ala-Pro-His-Ser-Val-Phe-Asn-Val-Asp (SEQ ID NO: 25).
94. The protein of claim 93, wherein said protein is conformationally-constrained.
95. A protein comprising the sequence Trp-Ser-Pro-Glu-Leu-Leu-Arg-Ala-Met-Val-Ala-Phe-Arg-Trp-Leu-Leu-Glu-Arg-Arg-Pro (SEQ ID NO: 26).
96. The protein of claim 95, wherein said protein is conformationally-constrained.
97. Substantially pure DNA encoding the protein of claim 61.
98. Substantially pure DNA encoding the protein of claim 63.
99. Substantially pure DNA encoding the protein of claim 65.
100. Substantially pure DNA encoding the protein of claim 67.
101. Substantially pure DNA encoding the protein of claim 69.
102. Substantially pure DNA encoding the protein of claim 71.
103. Substantially pure DNA encoding the protein of claim 73.
104. Substantially pure DNA encoding the protein of claim 75.
105. Substantially pure DNA encoding the protein of claim 77.
106. Substantially pure DNA encoding the protein of claim 79.
107. Substantially pure DNA encoding the protein of claim 81.
108. Substantially pure DNA encoding the protein of claim 83.
109. Substantially pure DNA encoding the protein of claim 85.
110. Substantially pure DNA encoding the protein of claim 87.
111. Substantially pure DNA encoding the protein of claim 89.
112. Substantially pure DNA encoding the protein of claim 91.
113. Substantially pure DNA encoding the protein of claim 93.
114. Substantially pure DNA encoding the protein of claim 95.
115. A protein isolated by a method comprising: (a) providing a host cell which contains (i) a reporter gene operably linked to a DNA-binding-protein recognition site; and (ii) a fusion gene which expresses a fusion protein, said fusion protein comprising a test protein covalently bonded to a binding moiety which is capable of specifically binding to said DNA-binding-protein recognition site; (b) introducing into said host cell a second fusion gene which expresses a second fusion protein, said second fusion protein comprising one of said population of proteins covalently bonded to a gene activating moiety and being conformationally-constrained; and (c) measuring expression of said reporter gene; and (d) isolating a protein based on its ability to alter the expression of said reporter gene when present in said second fusion protein.
116. An interactor protein isolated by a method comprising: (a) providing a reporter gene operably linked to a DNA-binding-protein recognition site; (b) providing a first fusion protein, said first fusion protein comprising a first protein covalently bonded to a binding moiety which is capable of specifically binding to said DNA-binding-protein recognition site; (c) providing a second fusion protein, said second fusion protein comprising a second protein covalently bonded to a gene activating moiety and being conformationally-constrained, said second protein being capable of interacting with said first protein; (d) contacting a candidate interactor protein with said first protein or said second protein; (e) measuring expression of said reporter gene; and (f) isolating an interactor protein based on its ability to alter the expression of said reporter gene when present with said first protein or said second protein.
117. A method for detecting a protein in a sample, comprising (a) contacting said sample with a conformationally constrained protein which is capable of specifically binding to said protein and forming a complex; and (b) detecting said complex.
118. The method of claim 117, wherein said detecting step is carried out by an immunoprecipitation, Western blot, or affinity column technique that utilizes said conformationally constrained protein as the complex-forming reagent.
119. A method of assaying an interaction between a first protein and a second protein, comprising: (a) providing said first protein; (b) providing a fusion protein comprising said second protein, said second protein being conformationally-constrained; (c) contacting said first protein with said fusion protein under conditions which allow complex formation; (d) detecting said complex as an indication of an interaction; and (e) determining whether said first protein interacts with said fusion protein inside a cell.